CYANOBACTERIAL AND PLANT ACETYL-CoA CARBOXYLASE 

Description 

Technical Field of the Invention 

The present invention relates to polynucleotides and 
polypeptides of acetyl-CoA carboxylase in cyanobacteria and plants. 
Polynucleotides encoding acetyl-CoA carboxylase have use in conferring 
herbicide resistance and in determining the herbicide resistance of plants in a 
breeding program. 

Background of the Invention 

Acetyl-CoA carboxylase (ACC) is the first enzyme of the 
biosynthetic pathway to fatty acids. It belongs to a group of carboxylases that 
use biotin as cofactor and bicarbonate as a source of the carboxyl group. 
ACC catalyzes the addition of CO2 to acetyl-CoA to yield malonyl-CoA in 
two steps as shown below. 

BCCP + ATP + HCO3- → BCCP-CO2 + ADP + Pi (1) 

BCCP-CO2 + Acetyl-CoA → BCCP + malonyl-CoA (2) 

First, biotin becomes carboxylated at the expense of ATP. 
The carboxyl group is then transferred to Ac-CoA [Knowles, 1989]. This 
irreversible reaction is the committed step in fatty acid synthesis and is a 
target for multiple regulatory mechanisms. Reaction (1) is catalyzed by 
biotin carboxylase (BC); reaction (2) by transcarboxylase (TC); BCCP = 
biotin carboxyl carrier protein. 

ACC purified from E. coli contains three distinct, separable 
components: biotin carboxylase (BC), a dimer of 49-kD monomers; biotin 
carboxyl carrier protein (BCCP), a dimer of 17-kD monomers; and 
transcarboxylase (TC), a tetramer containing two each of 33-kD and 35-kD 
subunits. The biotin prosthetic group is covalently attached to the ε-amino 
group of a lysine residue of BCCP. The primary structure of E. coli BCCP 
and BC is known (fabE and fabG genes, respectively, have been cloned and 
sequenced) [Alix, 1989; Maramatsu, et al., 1989; Li, et al., 1992]. In 
bacteria, fatty acids are primarily precursors of phospholipids rather than 
storage fuels, and so ACC activity is coordinated with cell growth and 
division. 

Rat and chicken ACC consist of a dimer of about 265 kD subunits (rat 
also has a 280 kD isoform) that contains all of the bacterial enzyme 
activities. Both mammalian and avian ACC are cytoplasmic enzymes and 
their substrate is transported out of mitochondria via citrate. ACC content 
and/or activity varies with the rate of fatty acid synthesis or energy 
requirements in different nutritional, hormonal and developmental states. 
ACC mRNA is transcribed using different promoters and can be regulated 
by alternative splicing. ACC catalytic activity is regulated allosterically by a 
number of metabolites and by reversible phosphorylation of the enzyme. 
The primary structure of rat and chicken enzymes, and the primary structure 
of the 5'-untranslated region of mRNA have been deduced from cDNA 
sequences [Lopez-Casillas, et al., 1988; Takai, et al., 1988]. The primary 
structure of yeast ACC has also been determined [Feel, et al., 1992]. 

Studies on plant ACC are far less advanced [Harwood, 1988]. 
It was originally thought that plant ACC consisted of low molecular weight 
dissociable subunits similar to those of bacteria. Those results appeared to 
be due to degradation of the enzyme during purification. More recent results 
indicate that the wheat enzyme, as well as those from parsley and rape, are 
composed of two about 220 kD monomers, similar to the enzyme from rat 
and chicken [Harwood, 1988; Egin-Buhler, et al., 1983; Wurtelle, et al., 
1990; Slabas, et al., 1985]. The plant ACC is located entirely in the stroma 
of plastids, where all plant fatty acid synthesis occurs. No plant gene 
encoding ACC has been reported to date. The gene must be nuclear because 
no corresponding sequence is seen in the complete chloroplast DNA 
sequences of tobacco, liverwort or rice. ACC, like the vast majority of 
chloroplast proteins which are encoded in nuclear DNA, must be synthesized 
in the cytoplasm and then transported into the chloroplast, probably requiring 
a chloroplast transport sequence. Although the basic features of plant ACC 
must be the same as those of prokaryotic and other eukaryotic ACCs, 
significant differences can be also expected, due, for example, to differences 
in plant cell metabolism and ACC cellular localization. 

Structural similarities deduced from the available amino acid 
sequences suggest strong evolutionary conservation among biotin 
carboxylases and biotin carboxylase domains of all biotin-dependent 
carboxylases. On the contrary, the BCCP domains show very little 
conservation outside the sequence E(A/V)MKM (lysine residue is 
biotinylated) which is found in all biotinylated proteins including pyruvate 
carboxylase and propionyl-CoA carboxylase [Knowles, 1989; Samols, et al., 
1988]. It is likely that the three functional domains of ACC located in E. coli 
on separate polypeptides are present in carboxylases containing two (human 
propionyl-CoA carboxylase) or only one (yeast pyruvate carboxylase, 
mammalian, avian and probably also plant ACC) polypeptide as a result of 
gene fusion during evolution. 

Several years ago it was shown that 
aryloxyphenoxypropionates and cyclohexanediones, powerful herbicides 
effective against monocot weeds, inhibit fatty acid biosynthesis in sensitive 
plants. Recently it has been determined that ACC is the target enzyme for 
both of these classes of herbicide. Dicotyledonous plants are resistant to 
these compounds, as are other eukaryotes and prokaryotes. The mechanisms 
of inhibition and resistance of the enzyme are not known [Lichtenthaler, 
1990]. 

It has occurred to others that the evolutionary relatedness of 
cyanobacteria and plants makes the former useful sources of cloned genes for 
the isolation of plant cDNAs. For example, Pecker et al. used the cloned 
gene for the enzyme phytoene desaturase, which functions in the synthesis of 
carotenoids, from cyanobacteria as a probe to isolate the cDNA for that gene 
from tomato [Pecker, et al., 1992]. 

Brief Summary of the Invention 

In one aspect the present invention provides an isolated and 
purified polynucleotide of from about 1350 to about 40,000 base pairs that 
encodes a polypeptide having the ability to catalyze the carboxylation of a 
biotin carboxyl carrier protein of a cyanobacterium. Preferably, that 
polypeptide is a subunit of acetyl-CoA carboxylase and participates in the 
carboxylation of acetyl-CoA. In a preferred embodiment, a cyanobacterium 
is Anabaena or Synechococcus. The biotin carboxyl carrier protein 
preferably includes the amino acid residue sequence shown in SEQ ID 
NO: 111 or a functional equivalent thereof. 

In another preferred embodiment, the polypeptide has the 
amino acid residue sequence of Figure 1 or Figure 2. The polynucleotide 
preferably includes the DNA sequence of SEQ ID NO: 1, the DNA sequence 
of SEQ ID NO: 1 from about nucleotide position 1300 to about nucleotide 
position 2650 or the DNA sequence of SEQ ID NO: 5. 

In another aspect, the present invention provides an isolated 
and purified polynucleotide of from about 480 to about 40,000 base pairs 
that encodes a biotin carboxyl carrier protein of a cyanobacterium and, 
preferably Anabaena. The biotin carboxyl carrier protein preferably includes 
the amino acid residue sequence of SEQ ID NO: 111 and the polynucleotide 
preferably includes the DNA sequence of SEQ ID NO: 110. 

Another polynucleotide provided by the present invention 
encodes a plant polypeptide having the ability to catalyze the carboxylation 
of acetyl-CoA. A plant polypeptide is preferably (1) a monocotyledonous 
plant polypeptide such as a wheat, rice, maize, barley, rye, oats or timothy 
grass polypeptide or (2) a dicotyledonous plant polypeptide such as a 
soybean, rape, sunflower, tobacco, Arabidopsis, petunia, Canola, pea, 
bean, tomato, potato, lettuce, spinach, alfalfa, cotton or carrot polypeptide. 
Preferably, that polypeptide is a subunit of ACC and participates in the 
carboxylation of acetyl-CoA. 

Such a polynucleotide preferably includes the nucleotide 
sequence of SEQ ID NO: 108 and encodes the amino acid residue sequence 
of SEQ ID NO: 109. 



In yet another aspect, the present invention provides an 
isolated and purified DNA molecule comprising a promoter operatively 
linked to a coding region that encodes (1) a polypeptide having the ability to 
catalyze the carboxylation of a biotin carboxyl carrier protein of a 
cyanobacterium, (2) a biotin carboxyl carrier protein of a cyanobacterium or 
(3) a plant polypeptide having the ability to catalyze the carboxylation of 
acetyl-CoA, which coding region is operatively linked to a 
transcription-terminating region, whereby said promoter drives the 
transcription of said coding region. 

In another aspect, the present invention provides an isolated 
polypeptide having the ability to catalyze the carboxylation of a biotin 
carboxyl carrier protein of a cyanobacterium such as Anabaena or 
Synechococcus. Preferably a biotin carboxyl carrier protein includes the 
amino acid sequence of SEQ ID NO: 111 and the polypeptide has the amino 
acid residue sequence of Figure 1 or Figure 2. 

The present invention also provides (1) an isolated and 
purified biotin carboxyl carrier protein of a cyanobacterium such as 
Anabaena, which protein includes the amino acid residue sequence of SEQ 
ID NO: 111 and (2) an isolated and purified plant polypeptide having a 
molecular weight of about 220 kD, dimers of which have the ability to 
catalyze the carboxylation of acetyl-CoA. 

In yet another aspect, the present invention provides a process 
of increasing the herbicide resistance of a monocotyledonous plant 
comprising transforming the plant with a DNA molecule comprising a 
promoter operatively linked to a coding region that encodes a herbicide 
resistant polypeptide having the ability to catalyze the carboxylation of 
acetyl-CoA, which coding region is operatively linked to a 
transcription-terminating region, whereby the promoter is capable of driving the 
transcription of the coding region in a monocotyledonous plant. 

Preferably, a polypeptide is an acetyl-CoA carboxylase 
enzyme and, more preferably, a dicotyledonous plant acetyl-CoA 
carboxylase. In a preferred embodiment, a coding region includes the DNA 
sequence of SEQ ID NO: 108 and a promoter is CaMV35. 

The present invention also provides a transformed plant 
produced in accordance with the above process as well as a transgenic plant 
and a transgenic plant seed having incorporated into its genome a transgene 
that encodes a herbicide resistant polypeptide having the ability to catalyze 
the carboxylation of acetyl-CoA. 

In yet another aspect, the present invention provides a process 
of altering the carboxylation of acetyl-CoA in a cell comprising transforming 
the cell with a DNA molecule comprising a promoter operatively linked to a 
coding region that encodes a plant polypeptide having the ability to catalyze 
the carboxylation of acetyl-CoA, which coding region is operatively linked to 
a transcription-terminating region, whereby the promoter is capable of 
driving the transcription of the coding region in the cell. 

In a preferred embodiment, a cell is a cyanobacterium or a 
plant cell and a plant polypeptide is a monocotyledonous plant acetyl-CoA 
carboxylase enzyme such as wheat acetyl-CoA carboxylase enzyme. The 
present invention also provides a transformed cyanobacterium produced in 
accordance with such a process. 

The present invention still further provides a process for 
determining the inheritance of plant resistance to herbicides of the 
aryloxyphenoxypropionate or cyclohexanedione class, which process 
comprises the steps of: 

(a) measuring resistance to herbicides of the 
aryloxyphenoxypropionate or cyclohexanedione class in a parental plant line 
and in progeny of the parental plant line; 

(b) purifying DNA from said parental plant line and the 
progeny; 

(c) digesting the DNA with restriction enzymes to form 
DNA fragments; 

(d) fractionating the fragments on a gel; 

(e) transferring the fragments to a filter support; 

(f) annealing the fragments with a labelled RFLP probe 
consisting of a DNA molecule that encodes acetyl-CoA carboxylase or a 
portion thereof; 

(g) detecting the presence of complexes between the 
fragments and the RFLP probe; and 

(h) correlating the herbicide resistance of step (a) with the 
complexes of step (g) and thereby the inheritance of herbicide resistance. 
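
By way of illustration only, the correlation of step (h) can be sketched as a simple co-segregation count. The progeny records below are hypothetical and the function name `cosegregation` is introduced here for illustration; the specification does not prescribe any particular scoring method.

```python
from collections import Counter

def cosegregation(plants):
    """Score how often herbicide resistance and an RFLP band occur together.

    plants: list of (resistant, has_band) boolean pairs, one per plant.
    Returns (fraction of concordant plants, Counter of the four classes).
    """
    table = Counter(plants)
    concordant = table[(True, True)] + table[(False, False)]
    return concordant / len(plants), table

# Hypothetical progeny: (resistant in step (a), band detected in step (g)).
progeny = [
    (True, True), (True, True), (False, False),
    (True, True), (False, False), (False, True),
]

fraction, table = cosegregation(progeny)
print(f"concordant fraction: {fraction:.2f}")
print(f"sensitive plants carrying the band: {table[(False, True)]}")
```

A fraction near 1 would suggest that the band detected by the ACC probe is inherited together with resistance; discordant plants, such as the last hypothetical record, would count against that interpretation.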

Preferably, the acetyl-CoA carboxylase is a dicotyledonous 
plant acetyl-CoA carboxylase enzyme or a mutated monocotyledonous plant 
acetyl-CoA carboxylase that confers herbicide resistance or a hybrid 
acetyl-CoA carboxylase comprising a portion of a dicotyledonous plant acetyl-CoA 
carboxylase, a portion of a monocotyledonous plant acetyl-CoA carboxylase or 
one or more domains of a cyanobacterial acetyl-CoA carboxylase. 

In still yet another aspect, the present invention provides a 
process for identifying herbicide resistant variants of a plant acetyl-CoA 
carboxylase comprising the steps of: 

(a) transforming cyanobacteria with a DNA molecule that 
encodes a monocotyledonous plant acetyl-CoA carboxylase enzyme to form 
transformed cyanobacteria; 

(b) inactivating cyanobacterial acetyl-CoA carboxylase; 

(c) exposing the transformed cyanobacteria to a herbicide 
that inhibits acetyl-CoA carboxylase activity; 

(d) identifying transformed cyanobacteria that are resistant 
to the herbicide; and 

(e) characterizing DNA that encodes acetyl-CoA 
carboxylase from the cyanobacteria of step (d). 



Brief Description of the Drawings 

In the drawings which form a portion of the specification: 

Figure 1 shows the complete nucleotide sequence of a HindIII 
fragment that includes the fabG gene coding biotin carboxylase from the 
cyanobacterium Anabaena 7120, along with the amino acid sequence 
deduced from the coding sequence of the DNA. 

Figure 2 shows the nucleotide sequence of the coding region 
of the fabG gene from the cyanobacterium Anacystis nidulans R2, along with 
the amino acid sequence deduced from the coding sequence of the DNA. 

Figure 3 shows an alignment of the amino acid sequences of 
the BC proteins from both cyanobacteria and from E. coli, the BCCP 
proteins from Anabaena and from E. coli, along with the ACC enzymes 
from rat and chicken and several other biotin-containing carboxylases. Stars 
indicate positions that are identical in all sequences or all but one. The 
conventional one letter abbreviations for amino acids are used. The BC 
domains are indicated by a solid underline, the BCCP domains by a dashed 
underline. The symbol # indicates sequences not related to BC and, 
therefore, not considered in the alignment. The wheat ACC sequence 
deduced from the sequence of our cloned cDNA fragment is on the top line. 
Abbreviations used in the Figure are: Wh ACC, wheat ACC; Rt, rat; Ch, 
chicken; Yt, yeast; Sy ACC, Synechococcus BC; An ACC, Anabaena BC 
and BCCP proteins; EC ACC, E. coli BC and BCCP; Hm PCCA, human 
propionyl CoA carboxylase; Rt PCCA, rat propionyl CoA carboxylase; Yt 
PC, yeast pyruvate carboxylase. 

Figure 4 shows the conserved amino acid sequences used to 
design primers for the PCR to amplify the BC domain of ACC from wheat. 
The sequences of the oligonucleotide primers are also shown. In this and 
other figures showing primer sequences, A means adenine, C means 
cytosine, G means guanine, T means thymine, N means all four nucleotides, 
Y means T or C, R means A or G, K means G or T, M means A or C, W 
means A or T, and H means A,C or T. 

Figure 5 shows the sequences of the oligonucleotides used as 
primers for the PCR used to amplify the region of wheat ACC cDNA 
between the BC and BCCP domains. 

Figure 6 shows the nucleotide sequence of a portion of the 
wheat cDNA corresponding to ACC. The amino acid sequence deduced 
from the nucleotide sequence is also shown. The underlined sequences 
correspond to the primer sites shown in Figure 5. A unique sequence was 
found for the BC domain, suggesting that a single mRNA was the template 
for the final amplified products. For the sequence between the BC and 
BCCP domains, three different variants were found among four products 
sequenced, suggesting that three different gene transcripts were among the 
amplified products. This is not unexpected because wheat is hexaploid, i.e. 
it has three pairs of each chromosome. 

Figure 7 shows the sequences of the oligonucleotides used as 
primers to amplify most of the fabE gene encoding the biotin carboxyl 
carrier protein from DNA of Anabaena. 

Figure 8 shows the nucleotide sequence of a PCR product 
corresponding to a portion of the fabE gene encoding about 75% of the 
biotin carboxyl carrier protein from the cyanobacterium Anabaena, along 
with the amino acid sequence deduced from the coding sequence. The 
underlined sequences correspond to the primer sites shown in Figure 7. 

Detailed Description of the Invention 
I. Definitions 

The following words and phrases have the meanings set forth 
below. 

Expression: The combination of intracellular processes, 
including transcription and translation undergone by a coding DNA molecule 
such as a structural gene to produce a polypeptide. 

Promoter: A recognition site on a DNA sequence or group of 
DNA sequences that provide an expression control element for a structural 
gene and to which RNA polymerase specifically binds and initiates RNA 
synthesis (transcription) of that gene. 

Regeneration: The process of growing a plant from a plant 
cell (e.g. plant protoplast or explant). 

Structural gene: A gene that is expressed to produce a 
polypeptide. 

Transformation: A process of introducing an exogenous DNA 
sequence (e.g. a vector, a recombinant DNA molecule) into a cell or 
protoplast in which that exogenous DNA is incorporated into a chromosome 
or is capable of autonomous replication. 

Transformed cell: A cell whose DNA has been altered by the 
introduction of an exogenous DNA molecule into that cell. 

Transgenic cell: Any cell derived or regenerated from a 
transformed cell or derived from a transgenic cell. Exemplary transgenic 
cells include plant calli derived from a transformed plant cell and particular 
cells such as leaf, root, stem, e.g. somatic cells, or reproductive (germ) cells 
obtained from a transgenic plant. 

Transgenic plant: A plant or progeny thereof derived from a 
transformed plant cell or protoplast, wherein the plant DNA contains an 
introduced exogenous DNA molecule not originally present in a native, 
non-transgenic plant of the same strain. The terms "transgenic plant" and 
"transformed plant" have sometimes been used in the art as synonymous 
terms to define a plant whose DNA contains an exogenous DNA molecule. 
However, it is thought more scientifically correct to refer to a regenerated 
plant or callus obtained from a transformed plant cell or protoplast as being 
a transgenic plant, and that usage will be followed herein. 

Vector: A DNA molecule capable of replication in a host cell 
and/or to which another DNA segment can be operatively linked so as to 
bring about replication of the attached segment. A plasmid is an exemplary 
vector. 

Certain polypeptides are disclosed herein as amino acid 
residue sequences. Those sequences are written left to right in the direction 
from the amino to the carboxy terminus. In accordance with standard 
nomenclature, amino acid residue sequences are denominated by either a 
single letter or a three letter code as indicated below. 



Amino Acid Residue    3-Letter Code    1-Letter Code 

Alanine               Ala              A 
Arginine              Arg              R 
Asparagine            Asn              N 
Aspartic Acid         Asp              D 
Cysteine              Cys              C 
Glutamine             Gln              Q 
Glutamic Acid         Glu              E 
Glycine               Gly              G 
Histidine             His              H 
Isoleucine            Ile              I 
Leucine               Leu              L 
Lysine                Lys              K 
Methionine            Met              M 
Phenylalanine         Phe              F 
Proline               Pro              P 
Serine                Ser              S 
Threonine             Thr              T 
Tryptophan            Trp              W 
Tyrosine              Tyr              Y 
Valine                Val              V 
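
For convenience, the correspondence between the three letter and single letter codes can be expressed as a lookup table. The following is a minimal sketch; the names `THREE_TO_ONE` and `to_one_letter` are introduced here for illustration, and the example residues are the biotinylation motif EAMKM discussed in the Background.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def to_one_letter(residues):
    """Convert a list of three-letter residue codes (amino to carboxy
    terminus) into a one-letter sequence string."""
    return "".join(THREE_TO_ONE[r] for r in residues)

motif = to_one_letter(["Glu", "Ala", "Met", "Lys", "Met"])
print(motif)
```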



The present invention provides polynucleotides and 
polypeptides relating to a whole or a portion of acetyl-CoA carboxylase 
(ACC) of cyanobacteria and plants as well as processes using those 
polynucleotides and polypeptides. 

II. Polynucleotides 

As used herein the term "polynucleotide" means a sequence of 
nucleotides connected by phosphodiester linkages. A polynucleotide of the 
present invention can comprise from about 2 to about several hundred 
thousand base pairs. Preferably, a polynucleotide comprises from about 5 to 
about 150,000 base pairs. Preferred lengths of particular polynucleotides are 
set forth hereinafter. 

A polynucleotide of the present invention can be a 
deoxyribonucleic acid (DNA) molecule or a ribonucleic acid (RNA) 
molecule. Where a polynucleotide is a DNA molecule, that molecule can be 
a gene or a cDNA molecule. Nucleotide bases are indicated herein by a 
single letter code: adenine (A), guanine (G), thymine (T), cytosine (C), and 
uracil (U). 

A. Cyanobacteria 

In one embodiment, the present invention contemplates an 
isolated and purified polynucleotide of from about 1350 to about 40,000 base 
pairs that encodes a polypeptide having the ability to catalyze the 
carboxylation of a biotin carboxyl carrier protein of a cyanobacterium. 

Preferably, a biotin carboxyl carrier protein (BCCP) is derived 
from a cyanobacterium such as Anabaena or Synechococcus. A preferred 
Anabaena is Anabaena 7120. A preferred Synechococcus is Anacystis 
nidulans R2 (Synechococcus sp. strain PCC 7942). A biotin carboxyl carrier 
protein preferably includes the amino acid residue sequence shown in SEQ 
ID NO: 111 or a functional equivalent thereof. 

Preferably, a polypeptide is a biotin carboxylase enzyme of a 
cyanobacterium, which enzyme is a subunit of acetyl-CoA carboxylase and 
participates in the carboxylation of acetyl-CoA. In a preferred embodiment, 
a polypeptide encoded by such a polynucleotide has the amino acid residue 
sequence of Figure 1 or Figure 2, or a functional equivalent of those 
sequences. 

A polynucleotide preferably includes the DNA sequence of 
SEQ ID NO: 1 (Figure 1) or the DNA sequence of SEQ ID NO: 1 (Figure 1) 
from about nucleotide position 1300 to about nucleotide position 2650. 

The polynucleotide of SEQ ID NO: 1 contains a gene that 
encodes the biotin carboxylase (BC) enzyme from the 
cyanobacterium Anabaena. This gene was cloned in the following way: total 
DNA from Anabaena was digested with various restriction enzymes, 
fractionated by gel electrophoresis, and blotted onto GeneScreen Plus 
(DuPont). The blot was hybridized at low stringency (1 M NaCl, 57°C) 
with a probe consisting of an SstII-PstI fragment containing about 90% of the 
coding region of the fabG gene from E. coli. This probe identified a 3.1-kb 
HindIII fragment in the Anabaena digest that contained similar sequences. A 
mixture of about 3-kb HindIII fragments of Anabaena DNA was purified, 
then digested with NheI, yielding a HindIII-NheI fragment of 1.6 kb that 
hybridized with the fabG probe. The 1.6-kb region was purified by gel 
electrophoresis and cloned into pUC18. 

Plasmid minipreps were made from about 160 colonies, of 
which four were found to contain the 1.6-kb HindIII-NheI fragment that 
hybridized with the fabG probe. The 1.6-kb Anabaena fragment was then 
used as probe to screen, at high stringency (1 M NaCl, 65°C), a cosmid 
library of Anabaena DNA inserts averaging 40 kb in size. Five positive 
clones were found among 1920 tested, all of which contained the same size 
HindIII and NheI fragments as those identified by the E. coli probe 
previously. From one of the cosmids, the 3.1-kb HindIII fragment containing 
the Anabaena fabG gene was subcloned into pUC18 and sequenced using the 
dideoxy chain termination method. The complete nucleotide sequence of this 
fragment is shown in Figure 1. 

A similar procedure was used to clone the fabG gene from 
Synechococcus. In this case, the initial Southern hybridization showed that 
the desired sequences were contained in part on a 0.8-kb BamHI-PstI 
fragment. This size fragment was purified in two steps and cloned into the 
plasmid Bluescript KS. Minipreps of plasmids from 200 colonies revealed 
two that contained the appropriate fragment of Synechococcus DNA. This 
fragment was used to probe, at high stringency, a library of Synechococcus 
inserts in the cosmid vector pWB79. One positive clone was found among 
1728 tested. This cosmid contained a 2-kb BamHI and a 3-kb PstI fragment 
that had previously been identified by the E. coli fabG probe in digests of 
total Synechococcus DNA. Both fragments were subcloned from the cosmid 
into Bluescript KS and 2.4 kb, including the coding part of the fabG gene, 
were sequenced. The complete sequence of the coding region of the 
Synechococcus fabG gene is shown in Figure 2. 

In another aspect, the present invention provides an isolated 
and purified polynucleotide of from about 480 to about 40,000 base pairs 
that encodes a biotin carboxyl carrier protein of a cyanobacterium. That 
biotin carboxyl carrier protein preferably includes the amino acid residue 
sequence of Figure 8 (SEQ ID NO: 111) or a functional equivalent thereof. 
A preferred polynucleotide that encodes that polypeptide includes the DNA 
sequence of SEQ ID NO: 110 (Figure 8). 

B. Plants 

Another polynucleotide contemplated by the present invention 
encodes a plant polypeptide having the ability to catalyze the carboxylation 
of acetyl-CoA. Such a plant polypeptide is preferably a monocotyledonous 
or a dicotyledonous plant acetyl-CoA carboxylase enzyme. 

An exemplary and preferred monocotyledonous plant is wheat, 
rice, maize, barley, rye, oats or timothy grass. An exemplary and preferred 
dicotyledonous plant is soybean, rape, sunflower, tobacco, Arabidopsis, 
petunia, pea, Canola, bean, tomato, potato, lettuce, spinach, alfalfa, cotton 
or carrot. 

A monocotyledonous plant polypeptide is preferably wheat 
ACC, which ACC includes the amino acid residue sequence of SEQ ID 
NO: 109 (Figure 6) or a functional equivalent thereof. A preferred 
polynucleotide that encodes such a polypeptide includes the DNA sequence 
of SEQ ID NO: 108 (Figure 6). 

Amino acid sequences of biotin carboxylase (BC) from 
Anabaena and Synechococcus show great similarity with amino acid residue 
sequences from other ACC enzymes as well as with the amino acid residue 
sequences of other biotin-containing enzymes (See Figure 3). Based on that 
homology, the nucleotide sequences shown in Figure 4 were chosen for the 
construction of primers for polymerase chain reaction amplification of a 
corresponding region of the gene for ACC from wheat. Those primers have 
the nucleotide sequences shown below: 
Primer 1 

5' TCGAATTCGTNATNATHAARGC 3' (SEQ ID NO: 112); 

Primer 2 

5' GCTCTAGAGKRTGYTCNACYTG 3' (SEQ ID NO: 113); 
where N is A, C, G or T; H is A, C or T; R is A or G; Y is T or C and K 
is G or T. Primers 1 and 2 comprise a 14-nucleotide specific sequence 
based on a conserved amino acid sequence and an 8-nucleotide extension at 
the 5 '-end of the primer to provide anchors for rounds of amplification after 
the first round and to provide convenient restriction sites for analysis and 
cloning. 
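
As an illustrative check of the primer design described above, the sketch below splits each primer into its 8-nucleotide extension and 14-nucleotide specific portion and counts how many distinct oligonucleotides each degenerate primer represents. The helper name `degeneracy` is introduced here for illustration. GAATTC and TCTAGA are the known recognition sequences of EcoRI and XbaI; the specification itself refers only to "convenient restriction sites", so naming those enzymes is an inference from the sequences.

```python
# Nucleotide ambiguity codes as defined for the primers in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "H": "ACT", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "W": "AT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

PRIMER_1 = "TCGAATTCGTNATNATHAARGC"  # SEQ ID NO: 112
PRIMER_2 = "GCTCTAGAGKRTGYTCNACYTG"  # SEQ ID NO: 113

for name, primer in (("Primer 1", PRIMER_1), ("Primer 2", PRIMER_2)):
    extension, specific = primer[:8], primer[8:]
    print(name, extension, specific, len(specific), degeneracy(primer))

# Restriction sites inferred from the extensions (EcoRI, XbaI).
print("GAATTC" in PRIMER_1[:8], "TCTAGA" in PRIMER_2[:8])
```

Under these definitions Primer 1 is a mixture of 96 sequences and Primer 2 a mixture of 64, which is one reason a low annealing temperature range is a reasonable choice for the first rounds of amplification.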

cDNA amplification began with a preparation of total poly A- 
containing mRNA from eight day-old green plants (Triticum aestivum var. 
Era as described in [Lamppa, et al., 1992]). The first strand of cDNA was 
synthesized using random hexamers as primers for AMV reverse 
transcriptase following procedures described in [Haymerle, et al., 1986], 
with some modifications. Reverse transcriptase was inactivated by heat and 
low molecular weight material was removed by filtration. 

The PCR was initiated by the addition of polymerase at 95°C. 
Amplification was for 45 cycles, each 1 min at 95°C, 1 min at 42-46°C and 2 
min at 72°C. Both the reactions using Anabaena DNA and the 
single-stranded wheat cDNA as template yielded about 440 base pair (bp) 
products. The wheat product was eluted from a gel and reamplified using the 
same primers. That product, also 440 bp, was cloned into the Invitrogen (San 
Diego, CA) vector pCR1000 using their A/T tail method, and sequenced. 

In eukaryotic ACCs, a BCCP domain is located about 300 
amino acids away from the end of the BC domain, on the C-terminal side. 
Therefore, it is possible to amplify the cDNA covering the interval between 
the BC and BCCP domains using primers from the C-terminal end of the BC 
domain and the conserved MKM region of the BCCP. The BC primer was 
based on the wheat cDNA sequence obtained as described above. Those 
primers, each with 6- or 8-base 5'-extensions, are shown below and in 
Figure 5. 

Primer 3 

5' GCTCTAGAATACTATTTCCTG 3' (SEQ ID NO: 114) 

Primer 4 

5' TCGAATTCWNCATYTTCATNRC 3' (SEQ ID NO: 115) 
N, R and Y are as defined above. W is A or T. The BC primer (Primer 3) 
was based on the wheat cDNA sequence obtained as described above. The 
MKM primer (primer 4) was first checked by determining whether it would 
amplify the fabE gene coding BCCP from Anabaena DNA. This PCR was 
primed at the other end by using a primer based on the N-terminal amino 
acid residue sequence as determined on protein purified from Anabaena 
extracts by affinity chromatography. Those primers are shown below and in 
Figure 7. 

Primer 5 

5' GCTCTAGAYTTYAAYGARATHMG 3' (SEQ ID NO: 116) 

Primer 4 

5' TCGAATTCWNCATYTTCATNRC 3' (SEQ ID NO: 115) 
H, N, R, T, Y and W are as defined above. M is A or C. This 
amplification (using the conditions described above) yielded the correct 
fragment of the Anabaena fabE gene, which was used to identify cosmids 
that contained the entire fabE gene and flanking DNA. An about 4 kb XbaI 
fragment containing the gene was cloned into the vector Bluescript KS for 
sequencing. 
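
The relationship between the MKM primer (Primer 4) and the conserved biotinylation motif can be illustrated as follows. Because Primer 4 primes from the C-terminal side, its specific portion is the reverse complement of the coding strand. The sketch below takes the reverse complement, expands the ambiguity codes and translates the first four codons with the standard genetic code. The reading frame used is an assumption chosen to match the E(A/V)MKM motif described in the Background, and the helper names are introduced here for illustration.

```python
from itertools import product

# Ambiguity codes and their complements (only those needed for Primer 4).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "R": "AG", "Y": "CT", "W": "AT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "N": "N", "R": "Y", "Y": "R", "W": "W"}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq):
    """Reverse complement of a sequence that may contain ambiguity codes."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def expand(seq):
    """Yield every unambiguous sequence represented by a degenerate one."""
    return ("".join(p) for p in product(*(IUPAC[b] for b in seq)))

def translate(dna):
    """Translate complete codons of an unambiguous DNA string."""
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))

PRIMER_4 = "TCGAATTCWNCATYTTCATNRC"  # SEQ ID NO: 115
specific = PRIMER_4[8:]              # portion after the 8-base extension
coding = reverse_complement(specific)
peptides = sorted({translate(s) for s in expand(coding[:12])})

print(coding)
print(peptides)
```

Under these assumptions the coding-strand equivalent is GYNATGAARATGNW, and its first four codons encode only AMKM or VMKM, consistent with the (A/V)MKM portion of the conserved motif.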

Primers 3 and 4 were then used to amplify the intervening 
sequence in wheat cDNA. Again, the product of the first PCR was eluted 
and reamplified by another round of PCR, then cloned into the Invitrogen 
vector pCRII. 

The complete 1.1 kb of the amplified DNA was sequenced and is 
shown in Figure 6, nucleotides 376-1473. The nucleotide sequence of the 
BC domain is also shown in Figure 6, nucleotides 1-422. Three clones of 
the BC domain gave the sequence shown. Four clones of the 1.1-kb 
fragment differed at several positions, corresponding to three closely related 
sequences, all of which are indicated in the Figure. Most of the sequence 
differences are in the third codon position and are silent in terms of the 
amino acid sequence. 
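
The coordinates given above can be checked with a few lines of arithmetic. The sketch below uses only the nucleotide positions stated for Figure 6; the variable and function names are introduced here for illustration.

```python
# Inclusive nucleotide coordinates in Figure 6, as stated in the text.
bc_domain = (1, 422)        # BC domain product
intervening = (376, 1473)   # product amplified with Primers 3 and 4

def span_length(span):
    """Length of an inclusive (start, end) interval."""
    start, end = span
    return end - start + 1

overlap = (min(bc_domain[1], intervening[1])
           - max(bc_domain[0], intervening[0]) + 1)
combined = (max(bc_domain[1], intervening[1])
            - min(bc_domain[0], intervening[0]) + 1)

print(span_length(bc_domain), span_length(intervening), overlap, combined)
```

The second product is 1098 nucleotides long, in agreement with the stated 1.1 kb, and the two products share 47 nucleotides, as expected given that Primer 3 was designed from the BC domain sequence.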

The amino acid sequence of the polypeptide predicted from the 
cDNA sequence for this entire fragment of wheat cDNA (1473 nucleotides) 
is compared with the amino acid sequences of other ACC enzymes and 
related enzymes from various sources in Figure 3. The most significant 
identities are with the ACC of rat, chicken and yeast, as shown in the table 
below. Less extensive similarities are evident with the BC subunits of 
bacteria and the BC domains of other enzymes such as pyruvate carboxylase 
of yeast and propionyl CoA carboxylase of rat. The amino acid identities 
between wheat ACC and other biotin-dependent enzymes, within the BC 
domain (amino acid residues 312-630 in Figure 3) are shown below in Table 
1. 
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Table 1 

% identity 
with wheat ACC 



rat ACC 


58 


chicken ACC 


57 


yeast ACC 


56 


Synechococcus ACC 


32 


Anabaena ACC 


30 


E. coli ACC 


33 


rat propionyl CoA 


32 



carboxylase 
yeast pyruvate carboxylase 



31 



% identity
with rat ACC

(100)
31
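
For illustration, a percent identity figure of the kind tabulated in
Table 1 can be computed from a pairwise alignment as sketched below. The
convention used here (identical residues divided by aligned columns in
which neither sequence has a gap) is an assumption, since the text does
not state how the tabulated values were calculated; the function name and
the example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented six-column alignment: 4 identical of 5 ungapped columns.
    print(percent_identity("MKV-LA", "MRVALA"))
```

Other conventions (for example, dividing by the length of the shorter
sequence) give somewhat different figures, so values are comparable only
when computed the same way over the same region.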



C. Probes and Primers 

In another aspect, DNA sequence information provided by the
invention allows for the preparation of relatively short DNA (or RNA)
sequences having the ability to specifically hybridize to gene sequences of
the selected polynucleotides disclosed herein. In these aspects, nucleic acid
probes of an appropriate length are prepared based on a consideration of a
selected ACC gene sequence, e.g., a sequence such as that shown in Figures
1, 2, 6 or 8. The ability of such nucleic acid probes to specifically hybridize
to an ACC gene sequence lends them particular utility in a variety of
embodiments. Most importantly, the probes can be used in a variety of
assays for detecting the presence of complementary sequences in a given
sample.

In certain embodiments, it is advantageous to use
oligonucleotide primers. The sequence of such primers is designed using a
polynucleotide of the present invention for use in detecting, amplifying or
mutating a defined segment of an ACC gene from a cyanobacterium or a
plant using PCR technology. Segments of ACC genes from other organisms
can also be amplified by PCR using such primers.

To provide certain of the advantages in accordance with the
present invention, a preferred nucleic acid sequence employed for
hybridization studies or assays includes sequences that are complementary to
a stretch of at least about 10 to 30 nucleotides of an ACC sequence, such as
that shown in Figures 1, 2, 6 or 8. A size of at least 10 nucleotides in
length helps to ensure that the fragment will be of sufficient length to form a
duplex molecule that is both stable and selective. Molecules having
complementary sequences over stretches greater than 10 bases in length are
generally preferred, though, in order to increase stability and selectivity of
the hybrid, and thereby improve the quality and degree of specific hybrid
molecules obtained. One will generally prefer to design nucleic acid
molecules having gene-complementary stretches of 15 to 20 nucleotides, or
even longer where desired. Such fragments may be readily prepared by, for
example, directly synthesizing the fragment by chemical means, by
application of nucleic acid reproduction technology, such as the PCR
technology of U.S. Patent 4,603,102, herein incorporated by reference, or
by excising selected DNA fragments from recombinant plasmids containing
appropriate inserts and suitable restriction sites.
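
The relationship between probe length and selectivity can be illustrated
with a simple back-of-the-envelope calculation. The sketch below estimates
how many times a given n-nucleotide sequence would be expected to occur by
chance in a genome, under the simplifying assumptions of random sequence
and equal base frequencies. The genome sizes used (5 million and 16
billion base pairs) are illustrative figures chosen for this example, not
values from the text, and the function names are hypothetical.

```python
def expected_chance_matches(probe_length, genome_size_bp, both_strands=True):
    """Expected chance occurrences of one specific n-mer in a random genome."""
    strands = 2 if both_strands else 1
    return strands * genome_size_bp / 4 ** probe_length


def min_length_for_uniqueness(genome_size_bp, threshold=1.0):
    """Shortest probe length whose expected chance matches fall below threshold."""
    length = 1
    while expected_chance_matches(length, genome_size_bp) >= threshold:
        length += 1
    return length


if __name__ == "__main__":
    for genome in (5_000_000, 16_000_000_000):  # illustrative sizes only
        for n in (10, 15, 20):
            print(genome, n, expected_chance_matches(n, genome))
        print("minimum length:", min_length_for_uniqueness(genome))
```

On these assumptions a 10-mer is expected to recur by chance even in a
small genome, whereas stretches approaching 20 nucleotides are expected
to be unique in much larger ones, consistent with the stated preference
for longer complementary stretches.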

Accordingly, a nucleotide sequence of the invention can be
used for its ability to selectively form duplex molecules with complementary
stretches of the gene. Depending on the application envisioned, one will
desire to employ varying conditions of hybridization to achieve varying
degrees of selectivity of the probe toward the target sequence. For
applications requiring a high degree of selectivity, one will typically desire
to employ relatively stringent conditions to form the hybrids, for example,
one will select relatively low salt and/or high temperature conditions, such as
provided by 0.02M-0.15M NaCl at temperatures of 50°C to 70°C. These
conditions are particularly selective, and tolerate little, if any, mismatch
between the probe and the template or target strand.

Of course, for some applications, for example, where one
desires to prepare mutants employing a mutant primer strand hybridized to
an underlying template or where one seeks to isolate ACC coding
sequences for related species, functional equivalents, or the like, less
stringent hybridization conditions will typically be needed in order to allow
formation of the heteroduplex. In these circumstances, one may desire to
employ conditions such as 0.15M-0.9M salt, at temperatures ranging from
20°C to 55°C. Cross-hybridizing species can thereby be readily identified
as positively hybridizing signals with respect to control hybridizations. In
any case, it is generally appreciated that conditions can be rendered more
stringent by the addition of increasing amounts of formamide, which serves
to destabilize the hybrid duplex in the same manner as increased
temperature. Thus, hybridization conditions can be readily manipulated, and
the conditions chosen will generally depend on the desired results.
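
The qualitative effects of salt, temperature and formamide described above
can be made concrete with an empirical melting-temperature estimate. The
sketch below uses one widely quoted approximation for DNA:DNA hybrids,
Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/L. This
formula is not taken from the present text, its constants vary between
sources, and it is unreliable for very short oligonucleotides; the probe
parameters in the example (50% G+C, 300 nucleotides) are hypothetical.

```python
import math


def estimate_tm(gc_percent, length_nt, na_molar, formamide_percent=0.0):
    """Rough Tm (deg C) of a DNA:DNA hybrid from an empirical approximation.

    Assumed formula (not from the source text):
    81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    if na_molar <= 0 or length_nt <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)


if __name__ == "__main__":
    # Hypothetical probe: 50% G+C, 300 nt, across the salt ranges named above.
    for salt in (0.02, 0.15, 0.9):
        print(f"{salt} M Na+: Tm ~ {estimate_tm(50, 300, salt):.1f} C")
    # Effect of adding formamide at fixed salt.
    print(f"0.15 M Na+, 20% formamide: Tm ~ {estimate_tm(50, 300, 0.15, 20):.1f} C")
```

Because lowering the salt concentration or adding formamide lowers the
estimated Tm, a fixed hybridization temperature sits closer to the
melting point of the duplex, and mismatched hybrids are less likely to
remain stable.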

In certain embodiments, it is advantageous to employ a 
polynucleotide of the present invention in combination with an appropriate 
label for detecting hybrid formation. A wide variety of appropriate labels 
are known in the art, including radioactive, enzymatic or other ligands, such 
as avidin/biotin, which are capable of giving a detectable signal. 

In general, it is envisioned that a hybridization probe
described herein is useful both as a reagent in solution hybridization and
in embodiments employing a solid phase. In embodiments involving a
solid phase, the test DNA (or RNA) is adsorbed or otherwise affixed to a
selected matrix or surface. This fixed nucleic acid is then subjected to
specific hybridization with selected probes under desired conditions. The
selected conditions depend, as is well known in the art, on the particular
circumstances and criteria required (e.g., on the G+C contents, type of
target nucleic acid, source of nucleic acid, size of hybridization probe).
Following washing of the matrix to remove nonspecifically bound probe
molecules, specific hybridization is detected, or even quantified, by means of
the label.

D. Expression Vector 

The present invention contemplates an expression vector
comprising a polynucleotide of the present invention. Thus, in one
embodiment an expression vector is an isolated and purified DNA molecule
comprising a promoter operatively linked to a coding region that encodes a
polypeptide having the ability to catalyze the carboxylation of a biotin
carboxyl carrier protein of a cyanobacterium, which coding region is
operatively linked to a transcription-terminating region, whereby the
promoter drives the transcription of the coding region.

As used herein, the term "operatively linked" means that a
promoter is connected to a coding region in such a way that the
transcription of that coding region is controlled and regulated by that
promoter. Means for operatively linking a promoter to a coding region are
well known in the art.

Where an expression vector of the present invention is to be
used to transform a cyanobacterium, a promoter is selected that has the
ability to drive and regulate expression in cyanobacteria. Promoters that
function in bacteria are well known in the art. An exemplary and preferred
promoter for the cyanobacterium Anabaena is the glnA gene promoter. An
exemplary and preferred promoter for the cyanobacterium Synechococcus is
the psbAI gene promoter. Alternatively, the cyanobacterial fabG gene
promoters themselves can be used.

Where an expression vector of the present invention is to be
used to transform a plant, a promoter is selected that has the ability to drive
expression in plants. Promoters that function in plants are also well known
in the art. Useful in expressing the polypeptide in plants are promoters that
are inducible, viral, synthetic, constitutive as described by Poszkowski et al.,
EMBO J., 2:2719 (1989) and Odell et al., Nature, 313:810 (1985), and
temporally regulated, spatially regulated, and spatiotemporally regulated as
given in Chau et al., Science, 244:174-181 (1989).

A promoter is also selected for its ability to direct the
transformed plant cell's or transgenic plant's transcriptional activity to the
coding region. Structural genes can be driven by a variety of promoters in
plant tissues. Promoters can be near-constitutive, such as the CaMV 35S
promoter, or tissue specific or developmentally specific promoters affecting
dicots or monocots.

Where the promoter is a near-constitutive promoter such as
CaMV 35S, increases in polypeptide expression are found in a variety of
transformed plant tissues (e.g. callus, leaf, seed and root). Alternatively, the
effects of transformation can be directed to specific plant tissues by using
plant integrating vectors containing a tissue-specific promoter.

An exemplary tissue-specific promoter is the Lectin promoter,
which is specific for seed tissue. The Lectin protein in soybean seeds is
encoded by a single gene (Le1) that is only expressed during seed maturation
and accounts for about 2 to about 5 percent of total seed mRNA. The Lectin
gene and seed-specific promoter have been fully characterized and used to
direct seed specific expression in transgenic tobacco plants. See, e.g.,
Vodkin et al., Cell, 34:1023 (1983) and Lindstrom et al., Developmental
Genetics, 11:160 (1990).

An expression vector containing a coding region that encodes
a polypeptide of interest is engineered to be under control of the Lectin
promoter and that vector is introduced into plants using, for example, a
protoplast transformation method. Dhir et al., Plant Cell Reports, 10:97
(1991). The expression of the polypeptide is directed specifically to the
seeds of the transgenic plant.

A transgenic plant of the present invention produced from a
plant cell transformed with a tissue specific promoter can be crossed with a
second transgenic plant developed from a plant cell transformed with a
different tissue specific promoter to produce a hybrid transgenic plant that
shows the effects of transformation in more than one specific tissue.

Exemplary tissue-specific promoters are corn sucrose
synthetase 1 (Yang et al., Proc. Natl. Acad. Sci. U.S.A., 87:4144-48
(1990)), corn alcohol dehydrogenase 1 (Vogel et al., J. Cell Biochem.,
(supplement 13D, 312) (1989)), corn zein 19KD gene (storage protein)
(Boston et al., Plant Physiol., 83:742-46), corn light harvesting complex
(Simpson, Science, 233:34 (1986)), corn heat shock protein (O'Dell et al.,
Nature, 313:810-12 (1985)), pea small subunit RuBP Carboxylase (Poulsen et
al., Mol. Gen. Genet., 205:193-200 (1986); Cashmore et al., Gen. Eng. of
Plants, Plenum Press, New York, 29-38 (1983)), Ti plasmid mannopine
synthase (Langridge et al., Proc. Natl. Acad. Sci. USA, 86:3219-3223
(1989)), Ti plasmid nopaline synthase (Langridge et al., Proc. Natl. Acad.
Sci. USA, 86:3219-3223 (1989)), petunia chalcone isomerase (Van Tunen et
al., EMBO J., 7:1257 (1988)), bean glycine rich protein 1 (Keller et al.,
EMBO J., 8:1309-14 (1989)), CaMV 35S transcript (O'Dell et al., Nature,
313:810-12 (1985)) and Potato patatin (Wenzler et al., Plant Mol. Biol.,
12:41-50 (1989)). Preferred promoters are the cauliflower mosaic virus
(CaMV 35S) promoter and the S-E9 small subunit RuBP carboxylase
promoter.

The choice of which expression vector and ultimately to which 
promoter a polypeptide coding region is operatively linked depends directly 
on the functional properties desired, e.g. the location and timing of protein 
expression, and the host cell to be transformed. These are well known 
limitations inherent in the art of constructing recombinant DNA molecules. 
However, a vector useful in practicing the present invention is capable of 
directing the expression of the polypeptide coding region to which it is 
operatively linked. 

Typical vectors useful for expression of genes in higher plants
are well known in the art and include vectors derived from the
tumor-inducing (Ti) plasmid of Agrobacterium tumefaciens described by
Rogers et al., Meth. in Enzymol., 153:253-277 (1987). However, several other
plant integrating vector systems are known to function in plants including
pCaMVCN transfer control vector described by Fromm et al., Proc. Natl.
Acad. Sci. USA, 82:5824 (1985). Plasmid pCaMVCN (available from
Pharmacia, Piscataway, NJ) includes the cauliflower mosaic virus CaMV
35S promoter.

In preferred embodiments, the vector used to express the
polypeptide includes a selection marker that is effective in a plant cell,
preferably a drug resistance selection marker. One preferred drug resistance
marker is the gene whose expression results in kanamycin resistance; i.e.,
the chimeric gene containing the nopaline synthase promoter, Tn5 neomycin
phosphotransferase II and nopaline synthase 3' nontranslated region
described by Rogers et al., in Methods For Plant Molecular Biology, A.
Weissbach and H. Weissbach, eds., Academic Press Inc., San Diego, CA
(1988).

RNA polymerase transcribes a coding DNA sequence through 
a site where polyadenylation occurs. Typically, DNA sequences located a 
few hundred base pairs downstream of the polyadenylation site serve to 
terminate transcription. Those DNA sequences are referred to herein as 
transcription-termination regions. Those regions are required for efficient 
polyadenylation of transcribed messenger RNA (mRNA). 

Means for preparing expression vectors are well known in the
art. Expression (transformation) vectors used to transform plants and
methods of making those vectors are described in United States Patent Nos.
4,971,908, 4,940,835, 4,769,061 and 4,757,011, the disclosures of which are
incorporated herein by reference. Those vectors can be modified to include
a coding sequence in accordance with the present invention.

A variety of methods has been developed to operatively link 
DNA to vectors via complementary cohesive termini or blunt ends. For 
instance, complementary homopolymer tracts can be added to the DNA 
segment to be inserted and to the vector DNA. The vector and DNA 
segment are then joined by hydrogen bonding between the complementary 
homopolymeric tails to form recombinant DNA molecules. 

A coding region that encodes a polypeptide having the ability
to catalyze the carboxylation of a biotin carboxyl carrier protein of a
cyanobacterium preferably encodes a biotin carboxylase enzyme of a
cyanobacterium, which enzyme is a subunit of acetyl-CoA carboxylase and
participates in the carboxylation of acetyl-CoA. In a preferred embodiment,
such a polypeptide has the amino acid residue sequence of Figure 1 or
Figure 2, or a functional equivalent of those sequences. In accordance with
such an embodiment, a coding region comprises the entire DNA sequence of
SEQ ID NO:1 (Figure 1) or the DNA sequence of SEQ ID NO:1 (Figure 1)
from about nucleotide position 1300 to about nucleotide position 2650 or the
DNA sequence of SEQ ID NO:5 (Figure 2).
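
Nucleotide positions such as those recited above are conventionally
1-based and inclusive, whereas many programming languages index from
zero. The sketch below shows one way to extract such a region. The
function name `subsequence` is hypothetical, and the 3000-nucleotide
placeholder string merely stands in for a real sequence; it is not SEQ ID
NO:1.

```python
def subsequence(seq, start, end):
    """Return the region from start to end, 1-based and inclusive."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


if __name__ == "__main__":
    placeholder = "ACGT" * 750           # 3000 nt stand-in, not real data
    region = subsequence(placeholder, 1300, 2650)
    print("region length (nt):", len(region))
    print("complete codons:", len(region) // 3)
```

Because the recited positions are approximate ("about"), the length of
the extracted region need not be an exact multiple of three; the precise
boundaries of a reading frame would be taken from the sequence itself.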

In another embodiment, an expression vector comprises a
coding region of from about 480 to about 40,000 base pairs that encodes a
biotin carboxyl carrier protein of a cyanobacterium. That biotin carboxyl
carrier protein preferably includes the amino acid residue sequence of Figure
8 (SEQ ID NO: 111) or a functional equivalent thereof. A preferred such
coding region includes the DNA sequence of SEQ ID NO: 110 (Figure 8).

In yet another embodiment, an expression vector
comprises a coding region that encodes a plant polypeptide having the ability
to catalyze the carboxylation of acetyl-CoA. Such a plant polypeptide is
preferably a monocotyledonous or a dicotyledonous plant acetyl-CoA
carboxylase enzyme.

A preferred monocotyledonous plant polypeptide encoded by
such a coding region is wheat ACC, which ACC includes the
amino acid residue sequence of SEQ ID NO: 109 (Figure 6) or a functional
equivalent thereof. A preferred coding region includes the DNA sequence of
SEQ ID NO: 108 (Figure 6).

III. Polypeptide

The present invention contemplates a polypeptide that defines
a whole or a portion of an ACC of a cyanobacterium or a plant. Thus, in one
embodiment, the present invention provides an isolated polypeptide
having the ability to catalyze the carboxylation of a biotin carboxyl carrier
protein of a cyanobacterium such as Anabaena or Synechococcus.
Preferably, a biotin carboxyl carrier protein includes the amino acid
sequence of SEQ ID NO:111 and the polypeptide has the amino acid residue
sequence of Figure 1 or Figure 2.

The present invention also contemplates an isolated and
purified biotin carboxyl carrier protein of a cyanobacterium such as
Anabaena, which protein includes the amino acid residue sequence of SEQ
ID NO:111.

In another embodiment, the present invention contemplates an
isolated and purified plant polypeptide having a molecular weight of about
220 KD, dimers of which have the ability to catalyze the carboxylation of
acetyl-CoA. Such a polypeptide preferably includes the amino acid residue
sequence of SEQ ID NO: 109.
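
The molecular weights and coding-region sizes recited herein can be
related to one another by a rough calculation. The sketch below assumes
an average residue mass of about 110 daltons, which is a common rule of
thumb and not a value given in the text; the function names are
hypothetical. It converts the approximately 220 KD plant polypeptide to
an approximate residue count and coding length, and converts a 480 base
pair coding region to an approximate polypeptide mass.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the text


def residues_from_mass(mass_kd):
    """Approximate number of residues in a polypeptide of the given mass."""
    return round(mass_kd * 1000 / AVERAGE_RESIDUE_MASS_DA)


def coding_nt_from_residues(residues, include_stop=True):
    """Nucleotides needed to encode the residues, optionally with a stop codon."""
    return 3 * (residues + (1 if include_stop else 0))


def mass_kd_from_coding_nt(coding_nt):
    """Approximate mass (kD) encoded by a coding region of the given length."""
    return (coding_nt // 3) * AVERAGE_RESIDUE_MASS_DA / 1000


if __name__ == "__main__":
    n = residues_from_mass(220)
    print("approx. residues in a 220 KD polypeptide:", n)
    print("approx. coding length (nt):", coding_nt_from_residues(n))
    print("approx. mass encoded by 480 bp (kD):", mass_kd_from_coding_nt(480))
```

These are order-of-magnitude estimates only; actual masses depend on
amino acid composition and on any covalent modification such as
biotinylation.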

Modifications and changes may be made in the structure of
polypeptides of the present invention and still obtain a molecule having like
or otherwise desirable characteristics. For example, certain amino acids may
be substituted for other amino acids in a protein structure without
appreciable loss of interactive binding capacity with structures such as, for
example, antigen-binding regions of antibodies or binding sites on substrate
molecules. Since it is the interactive capacity and nature of a polypeptide
that defines that polypeptide's biological functional activity, certain amino
acid sequence substitutions can be made in a polypeptide sequence (or, of
course, its underlying DNA coding sequence) and nevertheless obtain a
polypeptide with like or even countervailing properties (e.g., antagonistic v.
agonistic).

In making such changes, the hydropathic index of amino acids
may be considered. The importance of the hydropathic amino acid index in
conferring interactive biologic function on a protein is generally understood
in the art (Kyte & Doolittle, J. Mol. Biol., 157:105-132, 1982). It is known
that certain amino acids may be substituted for other amino acids having a
similar hydropathic index or score and still result in a protein with similar
biological activity. Each amino acid has been assigned a hydropathic index
on the basis of its hydrophobicity and charge characteristics. These are:
isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8);
cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4);
threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline
(-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5);
asparagine (-3.5); lysine (-3.9); and arginine (-4.5).

It is believed that the relative hydropathic character of the
amino acid determines the secondary structure of the resultant polypeptide,
which in turn defines the interaction of the polypeptide with other molecules,
for example, enzymes, substrates, receptors, antibodies, antigens, and the
like. It is known in the art that an amino acid may be substituted by another
amino acid having a similar hydropathic index and still obtain a biologically
functionally equivalent protein. In such changes, the substitution of amino
acids whose hydropathic indices are within ±2 is preferred, those which are
within ±1 are particularly preferred, and those within ±0.5 are even more
particularly preferred.
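
The hydropathic index ranges stated above lend themselves to a simple
lookup. The sketch below encodes the index values listed in the text
(using one-letter amino acid codes) and reports which of the stated
ranges a proposed substitution falls into. The function name
`substitution_tier` and the returned labels are illustrative choices, and
the sketch says nothing about whether a particular substitution actually
preserves function.

```python
# Hydropathic index values as listed in the text (one-letter codes).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_tier(original, replacement):
    """Classify a substitution by the difference in hydropathic index."""
    # Round to one decimal place to avoid floating-point edge effects.
    delta = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    if delta <= 0.5:
        return "within 0.5"
    if delta <= 1.0:
        return "within 1"
    if delta <= 2.0:
        return "within 2"
    return "outside stated ranges"


if __name__ == "__main__":
    for pair in (("I", "V"), ("K", "R"), ("I", "C"), ("I", "R")):
        print(pair, substitution_tier(*pair))
```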

Substitution of like amino acids can also be made on the basis 
of hydrophilicity, particularly where the biological functional equivalent 
protein or peptide thereby created is intended for use in immunological 
embodiments. U.S. Patent 4,554,101, incorporated herein by reference, 
states that the greatest local average hydrophilicity of a protein, as governed 
by the hydrophilicity of its adjacent amino acids, correlates with its 
immunogenicity and antigenicity, i.e. with a biological property of the 
protein. 

As detailed in U.S. Patent 4,554,101, the following
hydrophilicity values have been assigned to amino acid residues: arginine
(+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine
(+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); proline (-0.5
± 1); threonine (-0.4); alanine (-0.5); histidine (-0.5); cysteine (-1.0);
methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine
(-2.3); phenylalanine (-2.5); tryptophan (-3.4). It is understood that an
amino acid can be substituted for another having a similar hydrophilicity
value and still obtain a biologically equivalent, and in particular, an
immunologically equivalent protein. In such changes, the substitution of
amino acids whose hydrophilicity values are within ±2 is preferred, those
which are within ±1 are particularly preferred, and those within ±0.5 are
even more particularly preferred.
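
The "greatest local average hydrophilicity" referred to above can be
located with a sliding window over the sequence. The sketch below uses
the hydrophilicity values listed in the text, ignoring the stated ±1
tolerances. The window of six residues is an assumption made for this
example, and the 13-residue peptide is invented for illustration.

```python
# Hydrophilicity values as listed in the text (tolerances of +/-1 ignored).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": -0.5, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def greatest_local_average(sequence, window=6):
    """Return (1-based start, segment, average) of the most hydrophilic window."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if avg > best_avg:  # keeps the first window in case of ties
            best_start, best_avg = start, avg
    return best_start + 1, sequence[best_start:best_start + window], best_avg


if __name__ == "__main__":
    peptide = "MLVAGKDERSLFW"  # hypothetical peptide, not from the text
    print(greatest_local_average(peptide))
```

A region identified in this way is only a candidate; whether it is in
fact immunogenic would have to be determined experimentally.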

As outlined above, amino acid substitutions are generally
therefore based on the relative similarity of the amino acid side-chain
substituents, for example, their hydrophobicity, hydrophilicity, charge, size,
and the like. Exemplary substitutions which take various of the foregoing
characteristics into consideration are well known to those of skill in the art
and include: arginine and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine and isoleucine.

The present invention thus contemplates functional equivalents 
of the polypeptides set forth above. A polypeptide of the present invention 
is prepared by standard techniques well known to those skilled in the art. 
Such techniques include, but are not limited to, isolation and purification 
from tissues known to contain that polypeptide and expression from cloned 
DNA using transformed cells. 

IV. Transformed or transgenic cells or plants

A cyanobacterium, a plant cell or a plant transformed with an 
expression vector of the present invention is also contemplated. A 
transgenic cyanobacterium, plant cell or plant derived from such a 
transformed or transgenic cell is also contemplated. 

Means for transforming cyanobacteria are well known in the
art. Typically, means of transformation are similar to those well known
means used to transform other bacteria such as E. coli. Synechococcus can
be transformed simply by incubation of log-phase cells with DNA (Golden
et al., 1987).

The application of brief, high-voltage electric pulses to a
variety of mammalian and plant cells leads to the formation of
nanometer-sized pores in the plasma membrane. DNA is taken directly into
the cell cytoplasm either through these pores or as a consequence of the
redistribution of membrane components that accompanies closure of the
pores. Electroporation can be extremely efficient and can be used both for
transient expression of cloned genes and for establishment of cell lines that
carry integrated copies of the gene of interest. Electroporation, in contrast
to calcium phosphate-mediated transfection and protoplast fusion, frequently
gives rise to cell lines that carry one, or at most a few, integrated copies of
the foreign DNA.

Methods for DNA transformation of plant cells include
Agrobacterium-mediated plant transformation, protoplast transformation,
gene transfer into pollen, injection into reproductive organs, injection into
immature embryos and particle bombardment. Each of these methods has
distinct advantages and disadvantages. Thus, one particular method of
introducing genes into a particular plant strain may not necessarily be the
most effective for another plant strain, but it is well known which methods
are useful for a particular plant strain.

Agrobacterium-mediated transfer is a widely applicable system
for introducing genes into plant cells because the DNA can be introduced
into whole plant tissues, thereby bypassing the need for regeneration of an
intact plant from a protoplast. The use of Agrobacterium-mediated plant
integrating vectors to introduce DNA into plant cells is well known in the
art. See, for example, the methods described by Fraley et al.,
Biotechnology, 3:629 (1985) and Rogers et al., Methods in Enzymology,
153:253-277 (1987). Further, the integration of the Ti-DNA is a relatively
precise process resulting in few rearrangements. The region of DNA to be
transferred is defined by the border sequences, and intervening DNA is
usually inserted into the plant genome as described by Spielmann et al., Mol.
Gen. Genet., 205:34 (1986) and Jorgensen et al., Mol. Gen. Genet.,
207:471 (1987).

Modern Agrobacterium transformation vectors are capable of
replication in E. coli as well as Agrobacterium, allowing for convenient
manipulations as described by Klee et al., in Plant DNA Infectious Agents,
T. Hohn and J. Schell, eds., Springer-Verlag, New York (1985) pp.
179-203.

Moreover, recent technological advances in vectors for
Agrobacterium-mediated gene transfer have improved the arrangement of
genes and restriction sites in the vectors to facilitate construction of vectors
capable of expressing various polypeptide coding genes. The vectors
described by Rogers et al., Methods in Enzymology, 153:253 (1987), have
convenient multi-linker regions flanked by a promoter and a polyadenylation
site for direct expression of inserted polypeptide coding genes and are
suitable for present purposes. In addition, Agrobacteria containing both
armed and disarmed Ti genes can be used for the transformations. In those
plant strains where Agrobacterium-mediated transformation is efficient, it is
the method of choice because of the facile and defined nature of the gene
transfer.

Agrobacterium-mediated transformation of leaf disks and other
tissues such as cotyledons and hypocotyls appears to be limited to plants that
Agrobacterium naturally infects. Agrobacterium-mediated transformation is
most efficient in dicotyledonous plants. Few monocots appear to be natural
hosts for Agrobacterium, although transgenic plants have been produced in
asparagus using Agrobacterium vectors as described by Bytebier et al., Proc.
Natl. Acad. Sci. USA, 84:5345 (1987). Therefore, commercially important
cereal grains such as rice, corn, and wheat must usually be transformed
using alternative methods. However, as mentioned above, the
transformation of asparagus using Agrobacterium can also be achieved. See,
for example, Bytebier et al., Proc. Natl. Acad. Sci. USA, 84:5345 (1987).

A transgenic plant formed using Agrobacterium transformation
methods typically contains a single gene on one chromosome. Such
transgenic plants can be referred to as being heterozygous for the added
gene. However, inasmuch as use of the word "heterozygous" usually
implies the presence of a complementary gene at the same locus of the
second chromosome of a pair of chromosomes, and there is no such gene in
a plant containing one added gene as here, it is believed that a more accurate
name for such a plant is an independent segregant, because the added,
exogenous gene segregates independently during mitosis and meiosis.

More preferred is a transgenic plant that is homozygous for
the added structural gene; i.e., a transgenic plant that contains two added
genes, one gene at the same locus on each chromosome of a chromosome
pair. A homozygous transgenic plant can be obtained by sexually mating
(selfing) an independent segregant transgenic plant that contains a single
added gene, germinating some of the seed produced and analyzing the
resulting plants produced for enhanced carboxylase activity relative to a
control (native, non-transgenic) or an independent segregant transgenic plant.

It is to be understood that two different transgenic plants can
also be mated to produce offspring that contain two independently
segregating added, exogenous genes. Selfing of appropriate progeny can
produce plants that are homozygous for both added, exogenous genes that
encode a polypeptide of interest. Back-crossing to a parental plant and
out-crossing with a non-transgenic plant are also contemplated.
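
The expected outcome of selfing can be illustrated by a simple Mendelian
calculation. The sketch below assumes that each added gene is a single
insert, that inserts at different loci are unlinked, and that all gametes
and progeny are equally viable; none of these assumptions is stated in
the text, and the function name `self_cross` is hypothetical. A genotype
is written as the number of copies (0, 1 or 2) of the added gene at each
locus.

```python
from fractions import Fraction
from itertools import product


def self_cross(copies_per_locus):
    """Progeny genotype frequencies from selfing a plant.

    copies_per_locus: tuple giving the number of added-gene copies (0, 1 or 2)
    at each independently segregating locus of the parent.
    Returns {genotype tuple: Fraction}.
    """
    per_locus = []
    for copies in copies_per_locus:
        p = Fraction(copies, 2)  # chance that a gamete carries the added gene
        per_locus.append({0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2})
    progeny = {}
    for combo in product(*(locus.items() for locus in per_locus)):
        genotype = tuple(copies for copies, _ in combo)
        probability = Fraction(1)
        for _, freq in combo:
            probability *= freq
        if probability:
            progeny[genotype] = progeny.get(genotype, Fraction(0)) + probability
    return progeny


if __name__ == "__main__":
    # Selfing an independent segregant carrying one added gene.
    print(self_cross((1,)))
    # Selfing a plant carrying one copy of each of two added genes.
    both = self_cross((1, 1))
    print("homozygous for both:", both[(2, 2)])
```

Under these assumptions about one quarter of the selfed progeny of an
independent segregant are homozygous for the added gene, and about one
sixteenth of the selfed progeny of a plant carrying two independently
segregating genes are homozygous for both, which is why a number of
seedlings are germinated and analyzed.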

Transformation of plant protoplasts can be achieved using
methods based on calcium phosphate precipitation, polyethylene glycol
treatment, electroporation, and combinations of these treatments. See, for
example, Potrykus et al., Mol. Gen. Genet., 199:183 (1985); Lorz et al.,
Mol. Gen. Genet., 199:178 (1985); Fromm et al., Nature, 319:791 (1986);
Uchimiya et al., Mol. Gen. Genet., 204:204 (1986); Callis et al., Genes and
Development, 1:1183 (1987); and Marcotte et al., Nature, 335:454 (1988).

Application of these systems to different plant strains depends
upon the ability to regenerate that particular plant strain from protoplasts.
Illustrative methods for the regeneration of cereals from protoplasts are
described in Fujimura et al., Plant Tissue Culture Letters, 2:74 (1985);
Toriyama et al., Theor. Appl. Genet., 73:16 (1986); Yamada et al., Plant
Cell Rep., 4:85 (1986); Abdullah et al., Biotechnology, 4:1087 (1986).

To transform plant strains that cannot be successfully
regenerated from protoplasts, other ways to introduce DNA into intact cells
or tissues can be utilized. For example, regeneration of cereals from
immature embryos or explants can be effected as described by Vasil,
Biotechnology, 6:397 (1988). In addition, "particle gun" or high-velocity
microprojectile technology can be utilized (Vasil, 1992).

Using that latter technology, DNA is carried through the cell
wall and into the cytoplasm on the surface of small metal particles as
described in Klein et al., Nature, 327:70 (1987); Klein et al., Proc. Natl.
Acad. Sci. U.S.A., 85:8502 (1988); and McCabe et al., Biotechnology,
6:923 (1988). The metal particles penetrate through several layers of cells
and thus allow the transformation of cells within tissue explants.

Metal particles have been used to successfully transform corn
cells and to produce fertile, stable transgenic tobacco plants as described by
Gordon-Kamm, W.J. et al., The Plant Cell, 2:603-618 (1990); Klein, T.M.
et al., Plant Physiol., 91:440-444 (1989); Klein, T.M. et al., Proc. Natl.
Acad. Sci. USA, 85:8502-8505 (1988); and Tomes, D.T. et al., Plant Mol.
Biol., 14:261-268 (1990). Transformation of tissue explants eliminates the
need for passage through a protoplast stage and thus speeds the production of
transgenic plants.

Thus, the amount of a gene coding for a polypeptide of
interest (i.e., a polypeptide having carboxylation activity) can be increased in
monocotyledonous plants such as corn by transforming those plants using
particle bombardment methods. Maddock et al., Third International
Congress of Plant Molecular Biology, Abstract 372 (1991). By way of
example, an expression vector containing a coding region for a
dicotyledonous ACC and an appropriate selectable marker is transformed
into a suspension of embryonic maize (corn) cells using a particle gun to
deliver the DNA coated on microprojectiles. Transgenic plants are
regenerated from transformed embryonic calli that express ACC. Particle
bombardment has been used to successfully transform wheat (Vasil et al.,
1992).

DNA can also be introduced into plants by direct DNA
transfer into pollen as described by Zhou et al., Methods in Enzymology,
101:433 (1983); D. Hess, Intern. Rev. Cytol., 107:367 (1987); and Luo et al.,
Plant Mol. Biol. Reporter, 6:165 (1988). Expression of polypeptide coding
genes can be obtained by injection of the DNA into reproductive organs of a
plant as described by Pena et al., Nature, 325:274 (1987). DNA can also be
injected directly into the cells of immature embryos or introduced during
the rehydration of desiccated embryos as described by Neuhaus et al.,
Theor. Appl. Genet., 75:30 (1987); and Benbrook et al., in Proceedings Bio
Expo 1986, Butterworth, Stoneham, MA, pp. 27-54 (1986).

The development or regeneration of plants from either single
plant protoplasts or various explants is well known in the art. See, for
example, Methods for Plant Molecular Biology, A. Weissbach and H.
Weissbach, eds., Academic Press, Inc., San Diego, CA (1988). This
regeneration and growth process typically includes the steps of selecting
transformed cells and culturing those individualized cells through the usual
stages of embryonic development to the rooted plantlet stage.
Transgenic embryos and seeds are similarly regenerated. The resulting
transgenic rooted shoots are thereafter planted in an appropriate plant growth
medium such as soil.

The development or regeneration of plants containing the
foreign, exogenous gene that encodes a polypeptide of interest introduced by
Agrobacterium from leaf explants can be achieved by methods well known in
the art such as described by Horsch et al., Science, 227:1229-1231 (1985).
In this procedure, transformants are cultured in the presence of a selection
agent and in a medium that induces the regeneration of shoots in the plant
strain being transformed as described by Fraley et al., Proc. Natl. Acad.
Sci. U.S.A., 80:4803 (1983).

This procedure typically produces shoots within two to four 
months and those shoots are then transferred to an appropriate root-inducing 
medium containing the selective agent and an antibiotic to prevent bacterial 
growth. Shoots that root in the presence of the selective agent to form
plantlets are then transplanted to soil or other media to allow the production 
of roots. These procedures vary depending upon the particular plant strain 
employed, such variations being well known in the art. 

Preferably, the regenerated plants are self-pollinated to 
provide homozygous transgenic plants, as discussed before. Otherwise, 
pollen obtained from the regenerated plants is crossed to seed-grown plants 
of agronomically important, preferably inbred lines. Conversely, pollen
from plants of those important lines is used to pollinate regenerated plants. 

A transgenic plant of the present invention containing a
desired polypeptide is cultivated using methods well known to one skilled in
the art. Any of the transgenic plants of the present invention can be
cultivated to isolate the desired ACC or the fatty acids that are the
products of the series of reactions whose first step is catalyzed by ACC.

A transgenic plant of this invention thus has an increased 
amount of a coding region (e.g., gene) that encodes a polypeptide of
interest. A preferred transgenic plant is an independent segregant and can 
transmit that gene and its activity to its progeny. A more preferred 
transgenic plant is homozygous for that gene, and transmits that gene to all 
of its offspring on sexual mating. 

Seed from a transgenic plant is grown in the field or 
greenhouse, and resulting sexually mature transgenic plants are self- 
pollinated to generate true breeding plants. The progeny from these plants 
become true breeding lines that are evaluated for, by way of example, 
herbicide resistance, preferably in the field, under a range of environmental 
conditions. 

The commercial value of a transgenic plant with increased 
herbicide resistance or with altered fatty acid production is enhanced if many 
different hybrid combinations are available for sale. The user typically 
grows more than one kind of hybrid based on such differences as time to 
maturity, standability or other agronomic traits. Additionally, hybrids 
adapted to one part of a country are not necessarily adapted to another part 
because of differences in such traits as maturity, disease and herbicide 
resistance. Because of this, herbicide resistance is preferably bred into a 
large number of parental lines so that many hybrid combinations can be 
produced. 

V. Process of increasing herbicide resistance 
Herbicides such as aryloxyphenoxypropionates and 
cyclohexanediones inhibit the growth of monocotyledonous weeds by 
interfering with fatty acid biosynthesis of herbicide sensitive plants. ACC is 
the target enzyme for those herbicides. Dicotyledonous plants, other 
eukaryotic organisms and prokaryotic organisms are resistant to those
compounds. 

Thus, the resistance of sensitive monocotyledonous plants to 
herbicides can be increased by providing those plants with ACC that is not 
sensitive to herbicide inhibition. The present invention therefore provides a 
process of increasing the herbicide resistance of a monocotyledonous plant 
comprising transforming the plant with a DNA molecule comprising a 
promoter operatively linked to a coding region that encodes a herbicide 
resistant polypeptide having the ability to catalyze the carboxylation of 
acetyl-CoA, which coding region is operatively linked to a transcription- 
terminating region, whereby the promoter is capable of driving the 
transcription of the coding region in a monocotyledonous plant. 

Preferably, the herbicide resistant polypeptide is a dicotyledonous
plant polypeptide such as an acetyl-CoA carboxylase enzyme from soybean,
rape, sunflower, tobacco, Arabidopsis, petunia, Canola, pea, bean, tomato,
potato, lettuce, spinach, alfalfa, cotton or carrot, or a functional
equivalent thereof. A promoter and a transcription-terminating region are
preferably the same as set forth above.

Transformed monocotyledonous plants can be identified using 
herbicide resistance. A process for identifying a transformed 
monocotyledonous plant cell comprises the steps of: 

(a) transforming the monocotyledonous plant cell with a 
DNA molecule that encodes a dicotyledonous acetyl-CoA carboxylase 
enzyme; and 

(b) determining the resistance of the plant cell to a
herbicide and thereby identifying the transformed monocotyledonous
plant cell.

Means for transforming a monocotyledonous plant cell are the
same as set forth above.

The resistance of a transformed plant cell to a herbicide is
preferably determined by exposing such a cell to an effective herbicidal dose
of a preselected herbicide and maintaining that cell for a period of time and
under culture conditions sufficient for the herbicide to inhibit ACC, alter
fatty acid biosynthesis or retard growth. The effects of the herbicide can be
studied by measuring plant cell ACC activity, fatty acid synthesis or growth.

An effective herbicidal dose of a given herbicide is that
amount of the herbicide that retards growth or kills plant cells not containing
herbicide-resistant ACC or that amount of a herbicide known to inhibit plant
growth. Means for determining an effective herbicidal dose of a given
herbicide are well known in the art. Preferably, a herbicide used in such a
process is an aryloxyphenoxypropionate or cyclohexanedione herbicide.
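
By way of illustration only, the growth-based readout described above can be sketched in Python as a comparison of herbicide-treated cells with an untreated control. The function names, the 50% inhibition threshold and the growth values are hypothetical and are not taken from this specification; the sketch merely shows one way such measurements might be scored.

```python
def percent_inhibition(treated_growth: float, control_growth: float) -> float:
    """Growth inhibition (%) of herbicide-treated cells relative to an untreated control."""
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return 100.0 * (1.0 - treated_growth / control_growth)


def is_resistant(treated_growth: float, control_growth: float,
                 threshold_pct: float = 50.0) -> bool:
    """Call a cell line resistant if inhibition stays below an assumed threshold."""
    return percent_inhibition(treated_growth, control_growth) < threshold_pct


# Hypothetical growth measurements (arbitrary units) at one herbicide dose.
sensitive_line = percent_inhibition(treated_growth=2.5, control_growth=10.0)
transformed_line = percent_inhibition(treated_growth=9.0, control_growth=10.0)
```

In practice the threshold and the growth measure (e.g., fresh weight or cell count) would be chosen for the particular herbicide and cell type.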

VI. Process of altering ACC activity

Acetyl-CoA carboxylase catalyzes the carboxylation of acetyl-CoA.
Thus, the carboxylation of acetyl-CoA in a cyanobacterium or a plant
can be altered by, for example, increasing an ACC gene copy number or
changing the composition (e.g., nucleotide sequence) of an ACC gene.
Changes in ACC gene composition can alter gene expression at either the
transcriptional or translational level. Alternatively, changes in gene
composition can alter ACC function (e.g., activity, binding) by changing
primary, secondary or tertiary structure of the enzyme. By way of example,
certain changes in ACC structure are associated with changes in the
resistance of that altered ACC to herbicides. The copy number of such a
gene can be increased by transforming a cyanobacterium or a plant cell with
an appropriate expression vector comprising a DNA molecule that encodes
ACC.

In one embodiment, therefore, the present invention
contemplates a process of altering the carboxylation of acetyl-CoA in a cell
comprising transforming the cell with a DNA molecule comprising a
promoter operatively linked to a coding region that encodes a polypeptide
having the ability to catalyze the carboxylation of acetyl-CoA, which coding
region is operatively linked to a transcription-terminating region, whereby
the promoter is capable of driving the transcription of the coding region in
the cell.

In a preferred embodiment, the cell is a cyanobacterium or a
plant cell, and the polypeptide is a cyanobacterial ACC or a plant ACC.
Exemplary and preferred expression vectors for use in such a process are the
same as set forth above.

Where a cyanobacterium is transformed with a plant ACC 
DNA molecule, that cyanobacterium can be used to identify herbicide 
resistant mutations in the gene encoding ACC. In accordance with such a 
use, the present invention provides a process for identifying herbicide 
resistant variants of a plant acetyl-CoA carboxylase comprising the steps of: 

(a) transforming cyanobacteria with a DNA molecule that
encodes a monocotyledonous plant acetyl-CoA carboxylase enzyme to form 
transformed or transfected cyanobacteria; 

(b) inactivating cyanobacterial acetyl-CoA carboxylase; 

(c) exposing the transformed cyanobacteria to an effective 
herbicidal amount of a herbicide that inhibits acetyl-CoA carboxylase 
activity; 

(d) identifying transformed cyanobacteria that are resistant 
to the herbicide; and 

(e) characterizing DNA that encodes acetyl-CoA 
carboxylase from the cyanobacteria of step (d). 

Means for transforming cyanobacteria as well as expression 
vectors used for such transformation are preferably the same as set forth 
above. In a preferred embodiment, cyanobacteria are transformed or 
transfected with an expression vector comprising a coding region that
encodes wheat ACC. 

Cyanobacteria resistant to the herbicide are identified. 
Identifying comprises growing or culturing transformed cells in the presence 
of the herbicide and recovering those cells that survive herbicide exposure. 

Transformed, herbicide-resistant cells are then grown in
culture, collected and total DNA extracted using standard techniques. ACC 
DNA is isolated, amplified if needed and then characterized by comparing 
that DNA with DNA from ACC known to be inhibited by that herbicide. 

VII. Process for Determining Herbicide Resistance Inheritability

In yet another aspect, the present invention provides a process
for determining the inheritance of plant resistance to herbicides of the
aryloxyphenoxypropionate or cyclohexanedione class. That process
comprises the steps of:

(a) measuring resistance to herbicides of the
aryloxyphenoxypropionate or cyclohexanedione class in a parental plant line
and in progeny of the parental plant line;

(b) purifying DNA from the parental plant line and the
progeny;

(c) digesting the DNA with restriction enzymes to form
DNA fragments;

(d) fractionating the fragments on a gel;

(e) transferring the fragments to a filter support;

(f) annealing the fragments with a labelled RFLP probe
consisting of a DNA molecule that encodes acetyl-CoA carboxylase or a
portion thereof;

(g) detecting the presence of complexes between the
fragments and the RFLP probe; and

(h) correlating the herbicide resistance of step (a) with the
complexes of step (g) and thereby the inheritance of herbicide resistance.
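
By way of illustration only, the correlation of step (h) can be sketched in Python as the fraction of plants in which the resistance phenotype of step (a) agrees with the presence of a given RFLP band from step (g). The function name and the progeny data are hypothetical and are not taken from this specification.

```python
def cosegregation(resistant: list[bool], band_present: list[bool]) -> float:
    """Fraction of plants whose resistance phenotype matches presence of the RFLP band."""
    if len(resistant) != len(band_present) or not resistant:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(r == b for r, b in zip(resistant, band_present))
    return matches / len(resistant)


# Hypothetical scores for four progeny plants.
progeny_resistant = [True, True, False, False]
progeny_band = [True, True, False, True]
agreement = cosegregation(progeny_resistant, progeny_band)
```

A value near 1.0 across many progeny would be consistent with the band marking the locus that confers resistance; a value near 0.5 would suggest independent segregation.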

In a preferred embodiment, the herbicide resistant variant of
acetyl-CoA carboxylase is a dicotyledonous plant acetyl-CoA carboxylase
enzyme or a portion thereof. In another preferred embodiment, the
herbicide resistant variant of acetyl-CoA carboxylase is a mutated
monocotyledonous plant acetyl-CoA carboxylase that confers herbicide
resistance or a hybrid acetyl-CoA carboxylase comprising a portion of a
monocotyledonous plant acetyl-CoA carboxylase, a portion of a dicotyledonous
plant acetyl-CoA carboxylase or one or more domains of a cyanobacterial
acetyl-CoA carboxylase.

The inheritability of phenotypic traits such as herbicide
resistance can be determined using RFLP analysis. Restriction fragment
length polymorphisms (RFLPs) are due to sequence differences detectable by
lengths of DNA fragments generated by digestion with restriction enzymes
and typically revealed by agarose gel electrophoresis. There are large
numbers of restriction endonucleases available, characterized by their
recognition sequences and source.
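
By way of illustration only, the way a sequence difference changes restriction fragment lengths can be sketched in Python. The two short alleles below are hypothetical and serve only as toy data; the recognition sequence AAGCTT with cleavage after the first A is that of HindIII.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after cutting a linear sequence at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site.
    """
    seq = seq.upper()
    cuts = []
    start = 0
    while True:
        i = seq.find(site, start)
        if i == -1:
            break
        cuts.append(i + cut_offset)
        start = i + 1
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical alleles differing by one base in the second HindIII site.
allele_a = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTT" + "TTTT"
allele_b = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTA" + "TTTT"

fragments_a = digest(allele_a, "AAGCTT", cut_offset=1)
fragments_b = digest(allele_b, "AAGCTT", cut_offset=1)
```

The single-base change removes one cleavage site, so the two alleles yield different fragment lengths, which is the polymorphism an RFLP probe detects.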

Restriction fragment length polymorphism analyses are
conducted, for example, by Native Plants Incorporated (NPI). This service
is available to the public on a contractual basis. For this analysis, the
genetic marker profile of the parental inbred lines is determined. If parental
lines are essentially homozygous at all relevant loci (i.e., they should have
only one allele at each locus), the diploid genetic marker profile of the
hybrid offspring of the inbred parents should be the sum of those parents,
e.g., if one parent had the allele A at a particular locus, and the other parent
had B, the hybrid is AB by inference.
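
By way of illustration only, the inference described above can be sketched in Python. The function name, locus names and allele labels are hypothetical; the sketch assumes, as stated above, that each inbred parent carries a single allele at every locus.

```python
def hybrid_profile(parent1: dict[str, str],
                   parent2: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Expected diploid marker profile of a hybrid from two homozygous inbred parents.

    Each parent maps a locus name to its single allele; the hybrid carries one
    allele from each parent at every locus.
    """
    if parent1.keys() != parent2.keys():
        raise ValueError("both parents must be scored at the same loci")
    return {locus: tuple(sorted((parent1[locus], parent2[locus])))
            for locus in parent1}


# Hypothetical inbred parents scored at two loci.
inbred_1 = {"locus1": "A", "locus2": "C"}
inbred_2 = {"locus1": "B", "locus2": "C"}
expected_hybrid = hybrid_profile(inbred_1, inbred_2)
```

A hybrid whose observed profile departs from this expectation would indicate that a parent was not homozygous at that locus or that the cross was not as assumed.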

Probes capable of hybridizing to specific DNA segments under
appropriate conditions are prepared using standard techniques well known to
those skilled in the art. The probes are labelled with radioactive isotopes or
fluorescent dyes for ease of detection. After restriction fragments are
separated by size, they are identified by hybridization to the probe.
Hybridization with a unique cloned sequence permits the identification of a
specific chromosomal region (locus). Because all alleles at a locus are
detectable, RFLPs are co-dominant alleles, thereby satisfying a criterion for a
genetic marker. They differ from some other types of markers, e.g., from
isozymes, in that they reflect the primary DNA sequence; they are not
products of transcription or translation. Furthermore, different RFLP
profiles result from different arrays of restriction endonucleases.

The following examples illustrate particular embodiments of
the present invention. It will be readily apparent to a skilled artisan that
changes, modifications and alterations can be made to those embodiments
without departing from the true scope or spirit of the invention.

Example 1: Isolation of Cyanobacterial ACC Polynucleotides

The polynucleotide of SEQ ID NO: 1 contains a gene that
encodes the enzyme biotin carboxylase (BC) from the
cyanobacterium Anabaena 7120. This gene was cloned from a total DNA
extract of Anabaena that was digested with various restriction enzymes,
fractionated by gel electrophoresis, and blotted onto GeneScreen Plus
(DuPont).

The blot was hybridized at low stringency (1 M NaCl, 57° C)
with a probe consisting of an SstII-PstI fragment containing about 90% of
the coding region of the fabG gene from E. coli. This probe identified a
3.1-kb HindIII fragment in the Anabaena digest that contained similar
sequences. A mixture of about 3-kb HindIII fragments of Anabaena DNA
was purified, then digested with NheI, yielding a HindIII-NheI fragment of
1.6 kb that hybridized with the fabG probe. The 1.6-kb region was purified
by gel electrophoresis and cloned into pUC18. Plasmid minipreps were
made from about 160 colonies, of which four were found to contain the
1.6-kb HindIII-NheI fragment that hybridized with the fabG probe. The 1.6-kb
Anabaena fragment was then used as probe to screen, at high stringency (1
M NaCl, 65° C), a cosmid library of Anabaena DNA inserts averaging 40
kb in size. Five positive cosmids were found among 1920 tested, all of which
contained the same size HindIII and NheI fragments as those identified by
the E. coli probe previously. From one of the cosmids, the 3.1-kb HindIII
fragment containing the Anabaena fabG gene was subcloned into pUC18 and
sequenced using the dideoxy chain termination method. The complete
nucleotide sequence of this fragment is shown in Figure 1.
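
By way of illustration only, a simple consistency check can be sketched in Python on the assumption that SEQ ID NO:1 is the sequence of the 3.1-kb HindIII fragment described above. The terminal blocks and the length are copied from the SEQ ID NO:1 listing; the function name is hypothetical.

```python
HINDIII_SITE = "AAGCTT"

# Terminal blocks as printed in the SEQ ID NO:1 listing.
first_block = "AAGCTTTTAT"             # positions 1-10
last_blocks = "GAAAATTTTA" + "AGCTT"   # positions 3051-3065
listed_length = 3065                   # base pairs, as stated in the listing


def flanked_by_site(head: str, tail: str, site: str) -> bool:
    """True if the sequence begins and ends with the given recognition site."""
    return head.startswith(site) and tail.endswith(site)


hindiii_at_both_ends = flanked_by_site(first_block, last_blocks, HINDIII_SITE)
size_in_kb = round(listed_length / 1000, 1)
```

Under that assumption, a HindIII recognition sequence at each end and a length of about 3.1 kb are consistent with the fragment identified by the fabG probe.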

A similar procedure was used to clone the fabG gene from
Synechococcus. In this case, the initial Southern hybridization showed that
the desired sequences were contained in part on a 0.8-kb BamHI-PstI
fragment. Fragments of this size were purified in two steps and cloned into
the plasmid Bluescript KS. Minipreps of plasmids from 200 colonies revealed
two that contained the appropriate fragment of Synechococcus DNA. This
fragment was used to probe, at high stringency, a library of Synechococcus
inserts in the cosmid vector pWB79. One positive clone was found among
1728 tested. This cosmid contained a 2-kb BamHI and a 3-kb PstI fragment
that had previously been identified by the E. coli fabG probe in digests of
total Synechococcus DNA. Both fragments were subcloned from the cosmid
into Bluescript KS and 2.4 kb, including the coding part of the fabG gene,
were sequenced. The complete sequence of the coding region of the
Anacystis fabG gene is shown in Figure 2.

Example 2: Plant ACC 

The amino acid sequences of BC encoded by the fabG genes
from Anabaena and Synechococcus are aligned with sequences of ACC and
other biotin-containing enzymes from several sources in Figure 3. This
comparison allows the designation of several areas of significant
conservation among all the proteins, indicated by stars in the Figure. Based
on this alignment, the sequences shown in Figure 4 were chosen for the
construction of primers for the polymerase chain reaction, in order to
amplify the corresponding region of the gene for ACC from wheat. The
primers used for this amplification are shown in Figure 4. Each consists of
a 14-nucleotide specific sequence based on the amino acid sequence and an
8-nucleotide extension at the 5'-end of the primer to provide anchors for
rounds of amplification after the first round and to provide convenient
restriction sites for future analysis and cloning.
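
By way of illustration only, the primer layout described above (an 8-nucleotide 5'-extension followed by a 14-nucleotide specific sequence) can be sketched in Python, together with a count of how many distinct 14-nucleotide sequences encode a given five-residue peptide under the standard genetic code. The peptide, the extension and the function names are hypothetical and are not the primers of Figure 4; that a back-translated primer is degenerate is a general property of the genetic code, not a statement about the primers actually used.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}
# Amino acids whose codons begin with two different dinucleotides.
TWO_PREFIX = {"L", "R", "S"}


def degeneracy_14mer(peptide: str) -> int:
    """Count the 14-nt sequences encoding four full codons plus 2 nt of a fifth."""
    if len(peptide) != 5:
        raise ValueError("expected exactly five residues")
    total = 1
    for aa in peptide[:4]:
        total *= CODON_COUNT[aa]
    return total * (2 if peptide[4] in TWO_PREFIX else 1)


def build_primer(extension: str, specific: str) -> str:
    """Join an 8-nt 5'-extension to a 14-nt specific sequence."""
    if len(extension) != 8 or len(specific) != 14:
        raise ValueError("expected an 8-nt extension and a 14-nt specific sequence")
    return extension + specific


# Hypothetical example: peptide MKMEN back-translated with IUPAC code R (A or G),
# and a hypothetical extension containing an EcoRI site (GAATTC).
primer = build_primer("GAATTCGG", "ATGAARATGGARAA")
n_variants = degeneracy_14mer("MKMEN")
```

Choosing conserved residues with few codons (such as Met, Trp or Lys) keeps the number of primer variants small.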

cDNA amplification began with a preparation of total polyA-containing
mRNA from eight-day-old green plants (Triticum aestivum var.
Era, as described in [Lamppa, et al., 1992]). The first strand of cDNA was
synthesized using random hexamers as primers for AMV reverse
transcriptase following procedures described in [Haymerle, et al., 1986],
with some modifications. Reverse transcriptase was inactivated by
incubation at 90° C and low molecular weight material was removed by
filtration through Centricon 100. All components of the PCR (from the
Cetus/Perkin-Elmer kit) together with the two primers shown in Figure 4,
except the Taq DNA polymerase, were incubated for 3-5 min at 95° C. The
PCR was initiated by the addition of polymerase. Conditions were
established and optimized using Anabaena DNA as template, in order to
provide the best yield and lowest level of non-specific products for
amplification of the target BC gene from Anabaena DNA. Amplification
was for 45 cycles, each 1 min at 95° C, 1 min at 42-46° C and 2 min at 72° C.
Both the reactions using Anabaena DNA and the single-stranded wheat
cDNA as template yielded about 440-bp products. The wheat product was
eluted from a gel and reamplified using the same primers. That product,
also 440 bp, was cloned into the Invitrogen vector pCR1000 using their A/T
tail method, and sequenced. The nucleotide sequence is shown in Figure 5.

In eukaryotic ACCs, the BCCP domain is located about 300
amino acids away from the end of the BC domain, on the C-terminal side.
Therefore, it is possible to amplify the cDNA covering that interval using
primers from the C-terminal end of the BC domain and the conserved MKM
region of the BCCP. The BC primer was based on the wheat cDNA
sequence obtained as described above. These primers, each with 6- or
8-base 5'-extensions, are shown in Figure 6B.

The MKM primer was first checked by determining whether it
would amplify the fabE gene encoding BCCP from Anabaena DNA. This
PCR was primed at the other end by using a primer based on the N-terminal
amino acid sequence, determined on protein purified from Anabaena extracts
by affinity chromatography, shown in Figure 6A. This amplification (using
the conditions described above) worked, yielding the correct fragment of the
Anabaena fabE gene, whose complete sequence is shown in Figure 7.

The PCR-amplified fragment of the Anabaena fabE gene was
used to identify cosmids (three detected in a library of 1920) that contain the
entire fabE gene and flanking DNA. A 4-kb XbaI fragment containing the
gene was cloned into the vector Bluescript KS for sequencing. The two
primers shown in Figure 6 were then used to amplify the intervening
sequence in wheat cDNA. Again, the product of the first PCR was eluted
and reamplified by another round of PCR, then cloned into the Invitrogen
vector pCRII. The complete 1.1 kb of the amplified DNA was sequenced
and is also shown in Figure 5.

The foregoing examples illustrate particular embodiments of
the present invention. One of ordinary skill in the art will readily appreciate
that changes, modifications and alterations to those embodiments can be
made without departing from the true scope or spirit of the invention.

Title of Invention: Cyanobacterial and Plant Acetyl-CoA Carboxylase

Number of Sequences: 116

Correspondence Address:

(A) Addressee: Arnold, White & Durkee

(B) Street: 321 North Clark Street

(C) City: Chicago

(D) State: Illinois

(E) Country: USA

(F) Zip: 60610

(v) Computer Readable Form:

(A) Medium Type: Floppy Disk

(B) Computer: IBM PC Compatible

(C) Operating System: PC-DOS/MS-DOS

(D) Software: ASCII-DOS

(vi) Current Application Data:

(A) Application Number: 07/956,700

(B) Filing Date: 10/21/92

(C) Classification: Unknown

(vii) Attorney/Agent Information:

(A) Name: Thomas E. Northrup

(B) Registration Number: 33,268

(C) Reference/Docket Number: ARCD:058

(viii) Telecommunication Information:

(A) Telephone: 1-312-744-0090

(B) Telefax: 1-312-755-4489

(2) INFORMATION FOR SEQ ID NO:1:

(i) Sequence characteristics:

(A) Length: 3065 base pairs

(B) Type: Nucleic acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule Type: Oligonucleotide

(xi) Sequence Description: SEQ ID NO:1:

AAGCTTTTAT ATTTTGCCAT TTCTAGAACT TAGCTGCATC GGCCCCAAGT ATTTTGTCAA 60 

ATATGGCGAA AAGACTTCAT AAATCAAGGT TAAAGGTTGA CCGTGATGCC AAAACAGGTA 120 

ATGGCGACCC CAGAAAGGCC CATCCACGCC AAAACCTAAT TGCAAGGCCT CTGAATTTCC 180 

GTAATAAATA CCCCGCACAT CCCGATACAA CTCCGTGCGA AGACGAGCTA GACTTGCCCA 240 

AATTGGTAAT GAACGGTTTT GCAAATACTC GTCTACATGG CTGGCTTCCC ACCATGAGGT 300 

TGCATAGGCG AGTCGTTGGC CAGAGCGTGT ACGTAGCCAT ACCTGTCGCC GCAGTCTTGG 360 

CGCTGGAACA GATTGGATTA AATCCGGCGC ACTATCTAAA TCCAAACCAA TCAATGACAT 420 

ATCAATGACA TCGACTTCTG TTGGCTCACC AGTAAGTAAT TCTAAATGCC TTGTGGGTGA 480

GCCATCACCT AAGAGTAGTA GTTGCCACGC TGGAGCCAGC TGAGTGTGAG GCAAACTATG 540 

TTTAATTACT TCTTCCCCAC CTTGCCAAAT AGGAGTGAGG CGATGCCATC CGGCTGGCAG 600 

TGTTGAGTTG TTGCTTGGAG TAAAAGTGGC AGTCAATGTT CTTTACAAAA GTTCACCTAT 660 

TTATATCAAA GCATAAAAAA TTAATTAGTT GTCAGTTGTC ATTGGTTATT CTTCTTTGCT 720 

CCCCCTGCCC CCTACTTCCC TCCTCTGCCC AATAATTAGA AAGGTCAGGA GTCAAAAACT 780 

TATCACTTTT GACCACTGAC CTTTCACAAT TGACTATAGT CACTAAAAAA TGCGGATGGC 840 

GAGACTCGAA CTCGCAAGGC AAAGCCACAC GCACCTCAAG CGTGCGCGTA TACCAATTCC 900 

GCCACATCCG CACGGGTTGT ACAAGAAGAT ATACTAGCAC AAAAAAATTG CATAAAACAA 960 

GGTAAAACTA TATTTGCCAA ACTTTATGGA AAATTTATCT TGCTAAATAT ACAAATTTCC 1020

CGAAGAGGAT ACGAGACTAA CAGAAATGTA GTATCGCCAC AAGTGATATT AAAGGGGGTA 1080 

TGGGGGTTTT CTTCCCTTAC ACCCTTAAAC CCTCACACCC CACCTCCATG AAAAATCTTG 1140 

TTGGTAAGTC CGTTTCCTGC AATTTATTTA AAGATGAGCC TGGGGTATCT CCTGTCATAA 1200 

TTTGAGATGA AGCGATGCCT AAGGCGGCTA CGCTACGCGC TAAAAGCAAC TTGGATGGGA 1260 

GACAATTTCT ATCTGCTGGT ACTGATACTG ATATCGAAAA CTAGAAAATG AAGTTTGACA 1320 

AAATATTAAT TGCCAATCGG GGAGAAATAG CGCTGCGCAT TCTCCGCGCC TGTGAGGAAA 1380 

TGGGGATTGC GACGATCGCA GTTCATTCGA CTGTTGACCG GAATGCTCTT CATGTCCAAC 1440 

TTGCTGACGA AGCGGTTTGT ATTGGCGAAC CTGCTAGCGC TAAAAGTTAT TTGAATATTC 1500

CCAATATTAT TGCTGCGGCT TTAACGCGCA ATGCCAGTGC TATTCATCCT GGGTATGGCT 1560 

TTTTATCTGA AAATGCCAAA TTTGCGGAAA TCTGTGCTGA CCATCACATT GCATTCATTG 1620 

GCCCCACCCC AGAAGCTATC CGCCTCATGG GGGACAAATC CACTGCCAAG GAAACCATGC 1680 

AAAAAGCTGG TGTACCGACA GTACCGGGTA GTGAAGGTTT GGTAGAGACA GAGCAAGAAG 1740 

GATTAGAACT GGCGAAAGAT ATTGGCTACC CAGTGATGAT CAAAGCCACG GCTGGTGGTG 1800 

GCGGCCGGGG TATGCGACTG GTGCGATCGC CAGATGAATT TGTCAAACTG TTCTTAGCCG 1860

CCCAAGGTGA AGCTGGTGCA GCCTTTGGTA ATGCTGGCGT TTATATAGAA AAATTTATTG 1920 

AACGTCCGCG CCACATTGAA TTTCAAATTT TGGCTGATAA TTACGGCAAT GTGATTCACT 1980 

TGGGTGAGAG GGATTGCTCA ATTCAGCGTC GTAACCAAAA GTTACTAGAA GAAGCCCCCA 2040 

GCCCAGCCTT GGACTCAGAC CTAAGGGAAA AAATGGGACA AGCGGCGGTG AAAGCGGCTC 2100 

AGTTTATCAA TTACGCCGGG GCAGGTACTA TCGAGTTTTT GCTAGATAGA TCCGGTCAGT 2160 

TTTACTTTAT GGAGATGAAC ACCCGGATTC AAGTAGAACA TCCCGTAACT GAGATGGTTA 2220 

CTGGAGTGGA TTTATTGGTT GAGCAAATCA GAATTGCCCA AGGGGAAAGA CTTAGACTAA 2280

CTCAAGACCA AGTAGTTTTA CGCGGTCATG CGATCGAATG TCGCATCAAT GCCGAAGACC 2340 

CAGACCACGA TTTCCGCCCA GCACCCGGAC GCATTAGCGG TTATCTTCCC CCTGGCGGCC 2400 

CTGGCGTGCG GATTGACTCC CACGTTTACA CGGATTACCA AATTCCGCCC TACTACGATT 2460 

CCTTAATTGG TAAATTGATC GTTTGGGGCC CTGATCGCGC TACTGCTATT AACCGCATGA 2520 

AACGCGCCCT CAGGGAATGC GCCATCACTG GATTACCTAC AACCATTGGG TTTCATCAAA 2580 

GAATTATGGA AAATCCCCAA TTTTTACAAG GTAATGTGTC TACTAGTTTT GTGCAGGAGA 2640 

TGAATAAATA GGGTAATGGG TAATGGGTAA TGGGTAATAG AGTTTCAATC ACCAATTACC 2700 

AATTCCCTAA CTCATCCGTG CCAACATCGT CAGTAATCCT TGCTGGCCTA GAAGAACTTC 2760 

TCGCAACAGG CTAAAAATAC CAACACACAC AATGGGGGTG ATATCAACAC CACCTATTGG 2820 

TGGGATGATT TTTCGCAAGG GAATGAGAAA TGGTTCAGTC GGCCAAGCAA TTAAGTTGAA 2880 

GGGCAAACGG TTCAGATCGA CTTGCGGATA CCAGGTCAGA ATGATACGGA AAATAAACAG 2940 

AAATGTCATC ACTCCCAATA CAGGGCCAAG AATCCAAACG CTCAGGTTAA CACCAGTCAT 3000 

CGATCTAAGC TACTATTTTG TGAATTTACA AAAAACTGCA AGCAAAAGCT GAAAATTTTA 3060 

AGCTT 3065 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) Sequence characteristics: 

(A) Length: 32 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: peptide 

(xi) Sequence Description: SEQ ID NO: 2: 

Asp Glu Ala Met Pro Lys Ala Ala Thr Leu Arg Ala Lys Ser Asn Leu 
5 10 15 

Asp Gly Arg Gln Phe Leu Ser Ala Gly Thr Asp Thr Asp Ile Glu Asn
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) Sequence characteristics: 

(A) Length: 427 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 3: 

Lys Met Lys Phe Asp Lys Ile Leu Ile Ala Asn Arg Gly Glu Ile Ala
5 10 15

Leu Arg Ile Leu Arg Ala Cys Glu Glu Met Gly Ile Ala Thr Ile Ala
20 25 30

Val His Ser Thr Val Asp Arg Asn Ala Leu His Val Gln Leu Ala Asp
35 40 45

Glu Ala Val Cys Ile Gly Glu Pro Ala Ser Ala Lys Ser Tyr Leu Asn
50 55 60

Ile Pro Asn Ile Ile Ala Ala Ala Leu Thr Arg Asn Ala Ser Ala Ile
65 70 75 80

His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ala Lys Phe Ala Glu Ile
85 90 95

Cys Ala Asp His His Ile Ala Phe Ile Gly Pro Thr Pro Glu Ala Ile
100 105 110

Arg Leu Met Gly Asp Lys Ser Thr Ala Lys Glu Thr Met Gln Lys Ala
115 120 125

Gly Val Pro Thr Val Pro Gly Ser Glu Gly Leu Val Glu Thr Glu Gln
130 135 140

Glu Gly Leu Glu Leu Ala Lys Asp Ile Gly Tyr Pro Val Met Ile Lys
145 150 155 160

Ala Thr Ala Gly Gly Gly Gly Arg Gly Met Arg Leu Val Arg Ser Pro
165 170 175

Asp Glu Phe Val Lys Leu Phe Leu Ala Ala Gln Gly Glu Ala Gly Ala
180 185 190

Ala Phe Gly Asn Ala Gly Val Tyr Ile Glu Lys Phe Ile Glu Arg Pro
195 200 205

Arg His Ile Glu Phe Gln Ile Leu Ala Asp Asn Tyr Gly Asn Val Ile
210 215 220

His Leu Glu Arg Asp Cys Ser Ile Gln Arg Arg Asn Gln Lys Leu Leu
225 230 235 240

Glu Glu Ala Pro Ser Pro Ala Leu Asp Ser Asp Leu Arg Glu Lys Met
245 250 255

Gly Gln Ala Ala Val Lys Ala Ala Gln Phe Ile Asn Tyr Ala Gly Ala
260 265 270

Gly Thr Ile Glu Phe Leu Leu Asp Arg Ser Gly Gln Phe Gly Val Asp
275 280 285

Leu Leu Val Glu Gln Ile Arg Ile Ala Gln Gly Glu Arg Leu Arg Leu
290 295 300

Thr Gln Asp Gln Val Val Leu Arg Gly His Ala Ile Glu Cys Arg Ile
305 310 315 320

Asn Ala Glu Asp Pro Asp His Asp Phe Arg Pro Ala Pro Gly Arg Ile
325 330 335

Ser Gly Tyr Leu Pro Pro Gly Gly Pro Gly Val Arg Ile Asp Ser His
340 345 350

Val Tyr Thr Asp Tyr Gln Ile Pro Pro Tyr Tyr Asp Ser Leu Ile Gly
355 360 365

Lys Leu Ile Val Trp Gly Pro Asp Arg Ala Thr Ala Ile Asn Arg Met
370 375 380

Lys Arg Ala Leu Arg Glu Cys Ala Ile Thr Gly Leu Pro Thr Thr Ile
385 390 395 400

Gly Phe His Gln Arg Ile Met Glu Asn Pro Gln Phe Leu Gln Gly Asn
405 410 415

Val Ser Thr Ser Phe Val Gln Glu Met Asn Lys
420 425

(2) INFORMATION FOR SEQ ID NO: 4:

(i) Sequence characteristics:

(A) Length: 36 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 4:

Trp Val Met Gly Asn Arg Val Ser Ile Thr Asn Tyr Gln Phe Pro Asn
5 10 15

Ser Ser Val Pro Thr Ser Ser Val Ile Leu Ala Gly Leu Glu Glu Leu
20 25 30 

Leu Ala Thr Gly 
35 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) Sequence characteristics:

(A) Length: 1342 base pairs

(B) Type: Nucleic acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Oligonucleotide 



(xi) Sequence Description: SEQ ID NO: 5: 



ATGCGTTTCA ACAAGATCCT GATCGCCAAT CGCGGCGAAA TCGCCCTGCG CATTCTCCGC 60

ACTTGTCAAG AACTCGGGAT CGGCACGATC GCCGTTCACT CCACTGTGGA TCGCAACGCG 120

CTCCATGTGC AGTTAGCGGA CGAAGCGGTC TGTATTGGCG AAGCGGCCAG CAGCAAAAGC 180

TATCTCAATA TCCCCAACAT CATTGCGGCG GCCCTGACCC CTAATGCCAG CGCCATTCAC 240

CCCGGCTATG GCTTCTTGGC GGAGAATGCC CGCTTTGCAG AAATCTGCGC CGATCACCAT 300

CTCACCTTTA TTGGCCCCAG CCCCGATTCG ATTCGAGCCA TGGGCGATAA ATCCACCGCT 360

AAGGAAACAA TGCAGCGGGT CGGCGTTCCG ACGATTCCGG GCAGTGACGG TCTGCTGACG 400

GATGTTGATT CGGCTGCCAA AGTTGCTGCC GAGATCGGCT ATCCCGTCAT GATCAAAGCG 460

ACGGCGGGGG GCGGTGGTCG CGGTATGCGG CTGGTGCGTG ACCCTGCAGA TCTGGAAAAA 520

CTGTTCCTTG CTGCCCAAGG AGAAGCCGAG GCAGCTTTTG GGAATCCAGG ACTGTATCTC 580

GAAAAATTTA TCGATCGCCC ACGCCACGTT GAATTTCAGA TCTTGGCCGA TGCCTACGGC 640

AATGTAGTGC ATCTAGGCGA GCGCGATTGC TCCATTCAAC GTCGTCACCA AAAGCTGCTC 700

GAAGAAGCCC CCAGTCCGGC GCTATCGGCA GACCTGCGGC AGAAAATGGG CGATGCCGCC 760

GTCAAAGTCG CTCAAGCGAT CGGCTACATC GGTGCCGGCA CCGTGGAGTT TCTGGTCGAT 820

GCGACCGGCA ACTTCTACTT CATGGAGATG AATACCCGCA TCCAAGTCGA GCATCCAGTC 900

ACAGAAATGA TTACGGGACT GGACTTGATT GCGGAGCAGA TTCGGATTGC CCAAGGCGAA 960

GCGCTGCGCT TCCGGCAAGC CGATATTCAA CTGCGCGGCC ATGCGATCGA ATGCCGTATC 1020

AATGCGGAAG ATCCGGAATA CAATTTCCGG CCGAATCCTG GCCGCATTAC AGGCTATTTA 1080

CCGCCCGGCG GCCCCGGCGT TCGTGTCGAT TCCCATGTTT ATACCGACTA CGAAATTCCG 1140

CCCTATTACG ATTCGCTGAT TGGCAAATTG ATTGTCTGGG GTGCAACACG GGAAGAGGCG 1200

ATCGCGCGGA TGCAGCGTGC TCTGCGGGAA TGCGCCATCA CCGGCTTGCC GACGACCCTT 1260

AGTTTCCATC AGCTGATGTT GCAGATGCCT GAGTTCCTGC GCGGGGAACT CTATACCAAC 1300

TTTGTTGAGC AGGTGATGCT ACCTCGGATC CTCAAGTCCT AG 1342



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) Sequence characteristics: 

(A) Length: 453 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 6: 

Met Arg Phe Asn Lys Ile Leu Ile Ala Asn Arg Gly Glu Ile Ala Leu 
5 10 15 

Arg Ile Leu Arg Thr Cys Glu Glu Leu Gly Ile Gly Thr Ile Ala Val 
20 25 30 

His Ser Thr Val Asp Arg Asn Ala Leu His Val Gln Leu Ala Asp Glu 
35 40 45 

Ala Val Cys Ile Gly Glu Ala Ala Ser Ser Lys Ser Tyr Leu Asn Ile 
50 55 60 

Pro Asn Ile Ile Ala Ala Ala Leu Thr Arg Asn Ala Ser Ala Ile His 
65 70 75 80 

Pro Gly Tyr Gly Phe Leu Ala Glu Asn Ala Arg Phe Ala Glu Ile Cys 
85 90 95 

Ala Asp His His Leu Thr Phe Ile Gly Pro Ser Pro Asp Ser Ile Arg 
100 105 110 

Ala Met Gly Asp Lys Ser Thr Ala Lys Glu Thr Met Gln Arg Val Gly 
115 120 125 

Val Pro Thr Ile Pro Gly Ser Asp Gly Leu Leu Thr Asp Val Asp Ser 
130 135 140 

Ala Ala Lys Val Ala Ala Glu Ile Gly Tyr Pro Val Met Ile Lys Ala 
145 150 155 160 

Thr Ala Gly Gly Gly Gly Arg Gly Met Arg Leu Val Arg Glu Pro Ala 
165 170 175 

Asp Leu Glu Lys Leu Phe Leu Ala Ala Gln Gly Glu Ala Glu Ala Ala 
180 185 190 

Phe Gly Asn Pro Gly Leu Tyr Leu Glu Lys Phe Ile Asp Arg Pro Arg 
195 200 205 

His Val Glu Phe Gln Ile Leu Ala Asp Ala Tyr Gly Asn Val Val Glu 
210 215 220 

Leu Gly Glu Arg Asp Cys Ser Ile Gln Arg Arg His Gln Lys Leu Leu 
225 230 235 240 

Glu Glu Ala Pro Ser Pro Ala Leu Ser Ala Asp Leu Arg Gln Lys Met 
245 250 255 

Gly Asp Ala Ala Val Lys Val Ala Gln Ala Ile Gly Tyr Ile Gly Ala 
260 265 270 

Gly Thr Val Glu Phe Leu Val Asp Ala Thr Gly Asn Phe Tyr Phe Met 
275 280 285 

Glu Met Asn Thr Arg Ile Gln Val Glu His Pro Val Thr Glu Met Ile 
290 295 300 

Thr Gly Leu Asp Leu Ile Ala Glu Gln Ile Arg Ile Ala Gln Gly Glu 
305 310 315 320 

Ala Leu Arg Phe Arg Gln Ala Asp Ile Gln Leu Arg Gly His Ala Ile 
325 330 335 

Glu Cys Arg Ile Asn Ala Glu Asp Pro Glu Tyr Asn Phe Arg Pro Asn 
340 345 350 

Pro Gly Arg Ile Thr Gly Tyr Leu Pro Pro Gly Gly Pro Gly Val Arg 
355 360 365 

Val Asp Ser His Val Tyr Thr Asp Tyr Glu Ile Pro Pro Tyr Tyr Asp 
370 375 380 

Ser Leu Ile Gly Lys Leu Ile Val Trp Gly Ala Thr Arg Glu Glu Ala 
385 390 395 400 

Ile Ala Arg Met Gln Arg Ala Leu Arg Glu Gly Ala Ile Thr Gly Leu 
405 410 415 

Pro Thr Thr Leu Ser Phe His Gln Leu Met Leu Gln Met Pro Glu Phe 
420 425 430 

Leu Arg Gly Glu Leu Tyr Thr Asn Phe Val Glu Gln Val Met Leu Pro 
435 440 445 

Arg Ile Leu Lys Ser 
450 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) Sequence characteristics: 

(A) Length: 34 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 7: 

Met Asp Glu Pro Ser Pro Leu Ala Lys Thr Leu Glu Leu Asn Gln His 
5 10 15 

Ser Arg Phe Ile Ile Gly Ser Val Ser Glu Asp Asn Ser Glu Asp Glu 
20 25 30 

Ile Ser 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) Sequence characteristics: 

(A) Length: 187 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 8: 

Asn Leu Val Lys Leu Asp Leu Glu Glu Lys Glu Gly Ser Leu Ser Pro 
5 10 15 

Ala Ser Val Ser Ser Asp Thr Leu Ser Asp Leu Gly Ile Ser Ala Leu 
20 25 30 

Gln Asp Gly Leu Ala Phe His Met Arg Ser Ser Met Ser Gly Leu His 
35 40 45 

Leu Val Lys Gln Gly Arg Asp Arg Lys Lys Ile Asp Ser Gln Arg Asp 
50 55 60 

Phe Thr Val Ala Ser Pro Ala Glu Phe Val Thr Arg Phe Gly Gly Asn 
65 70 75 80 

Lys Val Ile Glu Lys Val Leu Ile Ala Asn Asn Gly Ile Ala Ala Val 
85 90 95 

Lys Cys Met Arg Ser Ile Arg Arg Trp Ser Tyr Glu Met Phe Arg Asn 
100 105 110 

Glu Arg Ala Ile Arg Phe Val Val Met Val Thr Pro Glu Asp Leu Lys 
115 120 125 

Ala Asn Ala Glu Tyr Ile Lys Met Ala Asp His Tyr Val Pro Val Pro 
130 135 140 

Gly Gly Ala Asn Asn Asn Asn Tyr Ala Asn Val Glu Leu Ile Leu Asp 
145 150 155 160 

Ile Ala Lys Arg Ile Pro Val Gln Ala Val Trp Ala Gly Trp Gly His 
165 170 175 

Ala Ser Glu Asn Pro Lys Leu Pro Glu Leu Leu 
180 185 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 9: 

Leu Lys Asn Gly Ile Ala Phe Met Gly Pro Pro Ser Gln Ala Met Trp 
5 10 15 

Ala Leu Gly Asp Lys Ile Ala Ser Ser Ile Val Ala Gln Thr Ala Gly 
20 25 30 

Ile Pro Thr Leu Pro Trp Ser Gly Ser Gly Leu Arg Val Asp Trp Gln 
35 40 45 

Glu Asn Asp Phe Ser Lys Arg Ile Leu Asn Val Pro Gln Asp Leu Tyr 
50 55 60 

Glu Lys Gly Tyr Val Lys Asp Val Asp Asp Gly Leu Lys Ala Ala Glu 
65 70 75 80 

Glu Val Gly Tyr Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly 
85 90 95 

Lys Gly Ile Arg Lys Val Asn Asn Ala Asp Asp Phe Pro Asn Leu Phe 
100 105 110 

Arg Gln Val Gln Ala Glu Val Pro Gly Ser 
115 120 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) Sequence characteristics: 

(A) Length: 86 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 10: 

Pro Ile Phe Val Met Arg Leu Ala Lys Gln Ser Arg His Leu Glu Val 
5 10 15 

Gln Ile Leu Ala Asp Gln Tyr Gly Asn Ala Ile Ser Leu Phe Gly Arg 
20 25 30 

Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu Glu Ala Pro 
35 40 45 

Ala Ala Ile Ala Thr Pro Ala Val Phe Glu His Met Glu Gln Cys Ala 
50 55 60 

Val Lys Leu Ala Lys Met Val Gly Tyr Val Ser Ala Gly Thr Val Glu 
65 70 75 80 

Tyr Leu Tyr Ser Gln Asp 
85 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) Sequence characteristics: 

(A) Length: 70 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 11: 

Gly Ser Phe Tyr Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His 
5 10 15 

Pro Cys Thr Glu Met Val Ala Asp Val Asn Leu Pro Ala Ala Gln Leu 
20 25 30 

Gln Ile Ala Met Gly Ile Pro Leu Phe Arg Ile Lys Asp Ile Arg Met 
35 40 45 

Met Tyr Gly Val Ser Pro Trp Gly Asp Ala Pro Ile Asp Phe Glu Asn 
50 55 60 

Ser Ala His Val Pro Cys 
65 70 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) Sequence characteristics: 

(A) Length: 20 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 12: 

Pro Arg Gly His Val Ile Ala Ala Arg Ile Thr Ser Glu Asn Pro Asp 
5 10 15 

Glu Gly Phe Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 13: 

Pro Ser Ser Gly Thr Val Gln Glu Leu Asn Phe Arg Ser Asn Lys Asn 
5 10 15 

Val Trp Gly Tyr Phe 
20 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 14: 

Ser Val Ala Ala Ala Gly Gly Leu His Glu Phe Ala Asp Ser Gln Phe 
5 10 15 

Gly His Cys Phe Ser Trp Gly Glu Asn Arg Glu Glu Ala Ile Ser Asn 
20 25 30 

Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe Arg Thr 
35 40 45 

Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Glu Ser Phe Gln Leu 
50 55 60 

Asn Arg Ile Asp Thr Gly Trp Leu Asp Arg Leu Ile Ala Glu Lys Val 
65 70 75 80 

Gln Ala Glu Arg Pro Asp Thr Met Leu Gly Val Val Cys Gly Ala Leu 
85 90 95 

His Val Ala Asp Val Asn Leu Arg Asn Ser Ile Ser Asn Phe Leu His 
100 105 110 

Ser Leu Glu Arg Gly Gln Val Leu Pro Ala 
115 120 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) Sequence characteristics: 

(A) Length: 190 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 15: 

His Thr Leu Leu Asn Thr Val Asp Val Glu Leu Ile Tyr Glu Gly Ile 
5 10 15 

Lys Tyr Val Leu Lys Val Thr Arg Gln Ser Pro Asn Ser Tyr Val Val 
20 25 30 

Ile Met Asn Gly Ser Cys Val Glu Val Asp Val His Arg Leu Ser Asp 
35 40 45 

Gly Gly Leu Leu Leu Ser Tyr Asp Gly Ser Ser Tyr Thr Thr Tyr Met 
50 55 60 

Lys Glu Glu Val Asp Arg Tyr Arg Ile Thr Ile Gly Asn Lys Thr Cys 
65 70 75 80 

Val Phe Glu Lys Glu Asn Asp Pro Ser Val Met Arg Ser Pro Ser Ala 
85 90 95 

Gly Lys Leu Ile Gln Tyr Ile Val Glu Asp Gly Gly His Val Phe Ala 
100 105 110 

Gly Gln Cys Tyr Ala Glu Ile Glu Val Met Lys Met Val Met Thr Leu 
115 120 125 

Thr Ala Val Glu Ser Gly Cys Ile His Tyr Val Lys Arg Pro Gly Ala 
130 135 140 

Ala Leu Asp Pro Gly Cys Val Ile Ala Lys Met Gln Leu Asp Asn Pro 
145 150 155 160 

Ser Lys Val Gln Gln Ala Glu Leu His Thr Gly Ser Leu Pro Gln Ile 
165 170 175 

Gln Ser Thr Ala Leu Arg Gly Glu Lys Leu His Arg Ile Phe 
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) Sequence characteristics: 

(A) Length: 37 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 16: 

Val Met Ile Lys Ala Ser Trp Gly Gly Gly Gly Lys Gly Ile Arg Lys 
5 10 15 

Val His Asn Asp Asp Glu Val Arg Ala Leu Phe Lys Gln Val Gln Gly 
20 25 30 

Glu Val Pro Gly Ser 
35 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) Sequence characteristics: 

(A) Length: 187 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 17: 

Pro Ile Phe Ile Met Lys Val Ala Ser Gln Ser Arg His Leu Glu Val 
5 10 15 

Gln Leu Leu Cys Asp Lys His Gly Asn Val Ala Ala Leu His Ser Arg 
20 25 30 

Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu Glu Gly Pro 
35 40 45 

Ile Thr Val Ala Pro Pro Glu Thr Ile Lys Glu Leu Glu Gln Ala Ala 
50 55 60 

Arg Arg Leu Ala Lys Cys Val Gln Tyr Gln Gly Ala Ala Thr Val Glu 
65 70 75 80 

Tyr Leu Tyr Ser Met Glu Thr Gly Glu Tyr Tyr Phe Leu Glu Leu Asn 
85 90 95 

Pro Arg Leu Gln Val Glu His Pro Val Thr Glu Trp Ile Ala Glu Ile 
100 105 110 

Asn Leu Pro Ala Ser Gln Val Val Val Gly Met Gly Ile Pro Leu Tyr 
115 120 125 

Asn Ile Pro Glu Ile Arg Arg Phe Tyr Gly Ile Glu His Gly Gly Gly 
130 135 140 

Tyr His Ala Trp Lys Glu Ile Ser Ala Val Ala Thr Lys Phe Asp Leu 
145 150 155 160 

Asp Lys Ala Gln Ser Val Lys Pro Lys Gly His Cys Val Ala Val Arg 
165 170 175 

Val Thr Ser Glu Asp Pro Asp Asp Gly Phe Lys 
180 185 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 18: 

Pro Thr Ser Gly Arg Val Glu Glu Leu Asn Phe Lys Ser Lys Pro Asn 
5 10 15 

Val Trp Ala Tyr Phe 
20 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 19: 

Ser Val Lys Ser Gly Gly Ala Ile His Glu Phe Ser Asp Ser Gln Phe 
5 10 15 

Gly His Val Phe Ala Phe Gly Glu Ser Arg Ser Leu Ala Ile Ala Asn 
20 25 30 

Met Val Leu Gly Leu Lys Glu Ile Gln Ile Arg Gly Glu Ile Arg Thr 
35 40 45 

Asn Val Asp Tyr Thr Val Asp Leu Leu Asn Ala Ala Glu Tyr Arg Glu 
50 55 60 

Asn Met Ile His Thr Gly Trp Leu Asp Ser Arg Ile Ala Met Arg Val 
65 70 75 80 

Arg Ala Glu Arg Pro Pro Trp Tyr Leu Ser Val Val Gly Gly Ala Leu 
85 90 95 

Tyr Glu Ala Ser Ser Arg Ser Ser Ser Val Val Thr Asp Tyr Val Gly 
100 105 110 

Tyr Leu Ser Lys Gly Gln Ile Pro Pro Lys 
115 120 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) Sequence characteristics: 

(A) Length: 124 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 20: 

His Ile Ser Leu Val Asn Leu Thr Val Thr Leu Asn Ile Asp Gly Ser 
5 10 15 

Lys Tyr Thr Ile Glu Thr Val Arg Gly Gly Pro Arg Ser Tyr Lys Leu 
20 25 30 

Arg Ile Asn Glu Ser Glu Val Glu Ala Glu Ile His Phe Leu Arg Asp 
35 40 45 

Gly Gly Leu Leu Met Gln Leu Asp Gly Asn Ser His Val Ile Tyr Ala 
50 55 60 

Glu Thr Glu Ala Ala Gly Thr Arg Leu Leu Ile Asn Gly Arg Thr Cys 
65 70 75 80 

Leu Leu Gln Lys Glu His Asp Pro Ser Arg Leu Leu Ala Asp Thr Pro 
85 90 95 

Cys Lys Leu Leu Arg Phe Leu Val Ala Asp Gly Ser His Val Val Ala 
100 105 110 

Asp Thr Pro Tyr Ala Glu Val Glu Ala Met Lys Met 
115 120 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) Sequence characteristics: 

(A) Length: 222 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 21: 

Met Glu Glu Ser Ser Gln Pro Ala Lys Pro Leu Glu Met Asn Pro His 
5 10 15 

Ser Arg Phe Ile Ile Gly Ser Val Ser Glu Asp Asn Ser Glu Asp Glu 
20 25 30 

Thr Ser Ser Leu Val Lys Leu Asp Leu Leu Glu Glu Lys Glu Arg Ser 
35 40 45 

Leu Ser Pro Val Ser Val Cys Ser Asp Ser Leu Ser Asp Leu Gly Leu 
50 55 60 

Pro Ser Ala Gln Asp Gly Leu Ala Asn His Met Arg Pro Ser Met Ser 
65 70 75 80 

Gly Leu His Leu Val Lys Gln Gly Arg Asp Arg Lys Lys Val Asp Val 
85 90 95 

Gln Arg Asp Phe Thr Val Ala Ser Pro Ala Glu Phe Val Thr Arg Phe 
100 105 110 

Gly Gly Asn Arg Val Ile Glu Lys Val Leu Ile Ala Asn Asn Gly Ile 
115 120 125 

Ala Ala Val Lys Cys Met Arg Ser Ile Arg Arg Trp Ser Tyr Glu Met 
130 135 140 

Phe Arg Asn Glu Arg Ala Ile Arg Phe Val Val Met Val Thr Pro Glu 
145 150 155 160 

Asp Leu Lys Ala Asn Ala Glu Tyr Ile Lys Met Ala Asp His Tyr Val 
165 170 175 

Pro Val Pro Gly Gly Pro Asn Asn Asn Asn Tyr Ala Asn Val Glu Leu 
180 185 190 

Ile Leu Asp Ile Ala Lys Arg Ile Pro Val Gln Ala Val Trp Ala Gly 
195 200 205 

Trp Gly His Ala Ser Glu Asn Pro Lys Leu Pro Glu Leu Leu 
210 215 220 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 22: 

His Lys Asn Gly Ile Ala Phe Met Gly Pro Pro Ser Gln Ala Met Trp 
5 10 15 

Ala Leu Gly Asp Lys Ile Ala Ser Ser Ile Val Ala Gln Thr Ala Gly 
20 25 30 

Ile Pro Thr Leu Pro Trp Asn Gly Ser Gly Leu Arg Val Asp Trp Gln 
35 40 45 

Glu Asn Asp Leu Gln Lys Arg Ile Leu Asn Val Pro Gln Glu Leu Tyr 
50 55 60 

Glu Lys Gly Tyr Val Lys Asp Ala Asp Asp Gly Leu Arg Ala Ala Glu 
65 70 75 80 

Glu Val Gly Tyr Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly 
85 90 95 

Lys Gly Ile Arg Lys Val Asn Asn Ala Asp Asp Phe Pro Asn Leu Phe 
100 105 110 

Arg Gln Val Gln Ala Glu Val Pro Gly Ser 
115 120 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) Sequence characteristics: 

(A) Length: 95 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 23: 

Pro Ile Phe Val Met Arg Leu Ala Lys Gln Ser Arg His Leu Glu Val 
5 10 15 

Gln Ile Leu Ala Asp Gln Tyr Gly Asn Ala Ile Ser Leu Phe Gly Arg 
20 25 30 

Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu Glu Ala Gly 
35 40 45 

Leu Arg Ala Ala Glu Glu Val Gly Tyr Pro Val Met Ile Lys Ala Ser 
50 55 60 

Glu Gly Gly Gly Gly Lys Gly Ile Arg Lys Val Asn Asn Ala Asp Asp 
65 70 75 80 

Phe Pro Asn Leu Phe Arg Gln Val Gln Ala Glu Val Pro Gly Ser 
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) Sequence characteristics: 

(A) Length: 86 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 24: 

Pro Ile Phe Val Met Arg Leu Ala Lys Gln Ser Arg His Leu Glu Val 
5 10 15 

Gln Ile Leu Ala Asp Gln Tyr Gly Asn Ala Ile Ser Leu Phe Gly Arg 
20 25 30 

Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu Glu Ala Pro 
35 40 45 

Ala Ser Ile Ala Thr Ser Val Val Phe Glu His Met Glu Gln Cys Ala 
50 55 60 

Val Lys Leu Ala Lys Met Val Gly Tyr Val Ser Ala Gly Thr Val Glu 
65 70 75 80 

Tyr Leu Tyr Ser Gln Asp 
85 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) Sequence characteristics: 

(A) Length: 70 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 25: 

Gly Ser Phe Tyr Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His 
5 10 15 

Pro Cys Thr Glu Met Val Ala Asp Val Asn Leu Pro Ala Ala Gln Leu 
20 25 30 

Gln Ile Ala Met Gly Ile Pro Leu His Arg Ile Lys Asp Ile Arg Val 
35 40 45 

Met Tyr Gly Val Ser Pro Trp Gly Asp Gly Ser Ile Asp Phe Glu Asn 
50 55 60 

Ser Ala His Val Pro Cys 
65 70 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) Sequence characteristics: 

(A) Length: 20 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 26: 

Pro Arg Gly His Val Ile Ala Ala Arg Ile Thr Ser Glu Asn Pro Asp 
5 10 15 

Glu Gly Phe Lys 
20 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 27: 

Pro Ser Ser Gly Thr Val Gln Glu Leu Asn Phe Arg Ser Asn Lys Asn 
5 10 15 

Val Trp Gly Tyr Phe 
20 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 28: 

Ser Val Ala Ala Ala Gly Gly Leu His Glu Phe Ala Asp Ser Gln Phe 
5 10 15 

Gly His Cys Phe Ser Trp Gly Glu Asn Arg Glu Glu Ala Ile Ser Asn 
20 25 30 

Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe Arg Thr 
35 40 45 

Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Glu Ser Phe Gln Gln 
50 55 60 

Asn Arg Ile Asp Thr Gly Trp Leu Asp Arg Leu Ile Ala Glu Lys Val 
65 70 75 80 

Gln Ala Glu Arg Pro Asp Thr Met Leu Gly Val Val Cys Gly Ala Leu 
85 90 95 

His Val Ala Asp Val Ser Phe Arg Asn Ser Val Ser Asn Phe Leu His 
100 105 110 

Ser Leu Glu Arg Gly Gln Val Leu Pro Ala 
115 120 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) Sequence characteristics: 

(A) Length: 90 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 29: 

Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe Arg Thr 
5 10 15 

Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Glu Ser Phe Gln Gln 
20 25 30 

Asn Arg Ile Asp Thr Gly Trp Leu Asp Arg Leu Ile Ala Glu Lys Val 
35 40 45 

Gln Ala Glu Arg Pro Asp Thr Met Leu Gly Val Val Cys Gly Ala Leu 
50 55 60 

His Val Ala Asp Val Ser Phe Arg Asn Ser Val Ser Asn Phe Leu His 
65 70 75 80 

Ser Leu Glu Arg Gly Gln Val Leu Pro Ala 
85 90 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) Sequence characteristics: 

(A) Length: 190 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 30: 

His Thr Leu Leu Asn Thr Val Asp Val Glu Leu Ile Tyr Glu Gly Arg 
5 10 15 

Lys Tyr Val Leu Lys Val Thr Arg Gln Ser Pro Asn Ser Tyr Val Val 
20 25 30 

Ile Met Asn Ser Ser Cys Val Glu Val Asp Val His Arg Leu Ser Asp 
35 40 45 

Gly Gly Leu Leu Leu Ser Tyr Asp Gly Ser Ser Tyr Thr Thr Tyr Met 
50 55 60 

Lys Glu Glu Val Asp Arg Tyr Arg Ile Thr Ile Gly Asn Lys Thr Cys 
65 70 75 80 

Val Phe Glu Lys Glu Asn Asp Pro Ser Ile Leu Arg Ser Pro Ser Ala 
85 90 95 

Gly Lys Leu Ile Gln Tyr Val Val Glu Asp Gly Gly His Val Phe Ala 
100 105 110 

Gly Gln Cys Phe Ala Glu Ile Glu Val Met Lys Met Val Met Thr Leu 
115 120 125 

Thr Ala Gly Glu Ser Gly Cys Ile His Tyr Val Lys Arg Pro Gly Ala 
130 135 140 

Val Leu Asp Pro Gly Cys Val Ile Ala Lys Leu Gln Leu Asp Asp Pro 
145 150 155 160 

Ser Arg Val Gln Gln Ala Glu Leu His Thr Gly Thr Leu Pro Gln Ile 
165 170 175 

Gln Ser Thr Ala Leu Arg Gly Glu Lys Leu His Arg Ile Phe 
180 185 190 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) Sequence characteristics: 

(A) Length: 41 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 31: 

Met Ser Glu Glu Ser Leu Phe Glu Ser Ser Pro Gln Lys Met Glu Tyr 
5 10 15 

Glu Ile Thr Asn Tyr Ser Glu Arg His Thr Glu Leu Pro Gly His Phe 
20 25 30 

Ile Gly Leu Asn Thr Val Asp Lys Leu 
35 40 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) Sequence characteristics: 

(A) Length: 74 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 32: 

Ala Asp Val Asp Ala Val Trp Ala Gly Trp Gly His Ala Ser Glu Asn 
5 10 15 

Pro Leu Leu Pro Glu Lys Leu Ser Gln Ser Lys Arg Lys Val Ile Phe 
20 25 30 

Ile Gly Pro Pro Gly Asn Ala Met Arg Ser Leu Gly Asp Lys Ile Ser 
35 40 45 

Ser Thr Thr Ile Val Ala Gln Ser Ala Lys Val Pro Cys Ile Pro Trp 
50 55 60 

Ser Gly Thr Thr Gly Val Asp Thr Val His 
65 70 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) Sequence characteristics: 

(A) Length: 73 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 33: 

Val Asp Glu Lys Thr Gly Leu Val Ser Val Asp Asp Asp Ile Tyr Gln 
5 10 15 

Lys Gly Cys Cys Thr Ser Pro Glu Asp Gly Leu Gln Lys Ala Lys Arg 
20 25 30 

Ile Gly Phe Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys 
35 40 45 

Gly Ile Arg Gln Val Glu Arg Glu Glu Asp Phe Ile Ala Leu Tyr His 
50 55 60 

Gln Ala Ala Asn Glu Ile Pro Gly Ser 
65 70 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) Sequence characteristics: 

(A) Length: 157 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 34: 

Pro Ile Phe Ile Met Lys Leu Ala Gly Arg Ala Arg His Leu Glu Val 
5 10 15 

Gln Leu Leu Ala Asp Gln Tyr Gly Thr Asn Ile Ser Leu Phe Gly Arg 
20 25 30 

Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile Ile Glu Glu Ala Pro 
35 40 45 

Val Thr Ile Ala Lys Ala Glu Thr Phe His Glu Met Glu Lys Ala Ala 
50 55 60 

Val Arg Leu Gly Lys Leu Val Gly Tyr Val Ser Ala Gly Thr Val Glu 
65 70 75 80 

Tyr Leu Tyr Ser His Asp Asp Gly Lys Phe Tyr Phe Leu Glu Leu Asn 
85 90 95 

Pro Arg Leu Gln Val Glu His Pro Thr Thr Glu Met Val Ser Gly Val 
100 105 110 

Asn Leu Pro Ala Ala Gln Leu Gln Ile Ala Met Gly Ile Pro Met His 
115 120 125 

Arg Ile Ser Asp Ile Arg Thr Leu Tyr Gly Met Asn Pro His Ser Ala 
130 135 140 

Ser Glu Ile Asp Phe Glu Phe Lys Thr Gln Asp Ala Thr 
145 150 155 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) Sequence characteristics: 

(A) Length: 27 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 35: 

Lys Lys Gln Arg Arg Pro Ile Pro Lys Gly His Cys Thr Ala Cys Arg 
5 10 15 

Ile Thr Ser Glu Asp Pro Asn Asp Gly Phe Lys 
20 25 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 36: 

Pro Ser Gly Gly Thr Leu His Glu Leu Asn Phe Arg Ser Ser Ser Asn 
5 10 15 

Val Trp Gly Tyr Phe 
20 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) Sequence characteristics: 

(A) Length: 122 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 37: 

Ser Val Gly Asn Asn Gly Asn Ile His Ser Phe Ser Asp Ser Gln Phe 
5 10 15 

Gly His Ile Phe Ala Phe Gly Glu Asn Arg Gln Ala Ser Arg Lys His 
20 25 30 

Met Val Val Ala Leu Lys Glu Leu Ser Ile Arg Gly Asp Phe Arg Thr 
35 40 45 

Thr Val Glu Tyr Leu Ile Lys Leu Leu Glu Thr Glu Asp Phe Glu Asp 
50 55 60 

Asn Thr Ile Thr Thr Gly Trp Leu Asp Asp Leu Ile Thr His Lys Met 
65 70 75 80 

Thr Ala Glu Lys Pro Asp Pro Thr Leu Ala Val Ile Cys Gly Ala Ala 
85 90 95 

Thr Lys Ala Phe Leu Ala Ser Glu Glu Ala Arg His Lys Tyr Ile Glu 
100 105 110 

Ser Leu Gln Lys Gly Gln Val Leu Ser Lys 
115 120 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) Sequence characteristics: 

(A) Length: 190 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 38: 

Asp Leu Leu Gln Thr Met Phe Pro Val Asp Phe Ile His Glu Gly Lys 
5 10 15 

Arg Tyr Lys Phe Thr Val Ala Lys Ser Gly Asn Asp Arg Tyr Thr Leu 
20 25 30 

Phe Ile Asn Gly Ser Lys Cys Asp Ile Ile Leu Arg Gln Leu Ser Asp 
35 40 45 

Gly Gly Leu Leu Ile Ala Ile Gly Gly Lys Ser His Thr Ile Tyr Trp 
50 55 60 

Lys Glu Glu Val Ala Ala Thr Arg Leu Ser Val Asp Ser Met Thr Thr 
65 70 75 80 

Leu Leu Glu Val Glu Asn Asp Pro Thr Gln Leu Arg Thr Pro Ser Pro 
85 90 95 

Gly Lys Leu Val Lys Phe Leu Val Glu Asn Gly Glu His Ile Ile Lys 
100 105 110 

Gly Gln Pro Tyr Ala Glu Ile Glu Val Met Lys Met Gln Met Pro Leu 
115 120 125 

Val Ser Gln Glu Asn Gly Ile Val Gln Leu Leu Lys Gln Pro Gly Ser 
130 135 140 

Thr Ile Val Ala Gly Asp Ile Met Ala Ile Met Thr Leu Asp Asp Pro 
145 150 155 160 

Ser Lys Val Lys His Ala Leu Pro Phe Glu Gly Met Leu Pro Asp Phe 
165 170 175 

Gly Ser Pro Val Ile Glu Gly Thr Lys Pro Ala Tyr Lys Phe 
180 185 190 



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) Sequence characteristics: 

(A) Length: 37 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 39: 

Met Arg Phe Asn Lys Ile Leu Ile Ala Asn Arg Gly Glu Ile Ala Leu 
5 10 15 

Arg Ile Leu Arg Thr Cys Glu Glu Leu Gly Ile Gly Thr Ile Ala Val 
20 25 30 

His Ser Thr Val Asp 
35 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 40: 

Arg Asn Ala Leu His Val Gln Leu Ala Asp Glu Ala Val Cys Ile Gly 
5 10 15 

Glu Ala Ala Ser Ser 
20 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) Sequence characteristics: 

(A) Length: 38 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 41: 

Lys Ser Tyr Leu Asn Ile Pro Asn Ile Ile Ala Ala Ala Leu Thr Arg 
5 10 15 

Asn Ala Ser Ala Ile His Pro Gly Tyr Gly Phe Leu Ala Glu Asn Ala 
20 25 30 

Arg Phe Ala Glu Ile Cys 
35 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) Sequence characteristics: 

(A) Length: 41 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 42: 

Ala Asp His His Leu Thr Phe Ile Gly Pro Ser Pro Asp Ser Ile Arg 
5 10 15 

Ala Met Gly Asp Lys Ser Thr Ala Lys Glu Thr Met Gln Arg Val Gly 
20 25 30 

Val Pro Thr Ile Pro Gly Ser Asp Gly 
35 40 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) Sequence characteristics: 

(A) Length: 143 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 43: 

Leu Leu Thr Asp Val Asp Ser Ala Ala Lys Val Ala Ala Glu Ile Gly 
5 10 15 

Tyr Pro Val Met Ile Lys Ala Thr Ala Gly Gly Gly Gly Arg Gly Met 
20 25 30 

Arg Leu Val Arg Glu Pro Ala Asp Leu Glu Lys Leu Phe Leu Ala Ala 
35 40 45 

Gln Gly Glu Ala Glu Ala Ala Phe Gly Asn Pro Gly Leu Tyr Leu Glu 
50 55 60 

Lys Phe Ile Asp Arg Pro Arg His Val Glu Phe Gln Ile Leu Ala Asp 
65 70 75 80 

Ala Tyr Gly Asn Val Val His Leu Gly Glu Arg Asp Cys Ser Ile Gln 
85 90 95 

Arg Arg His Gln Lys Leu Leu Glu Glu Ala Pro Ser Pro Ala Leu Ser 
100 105 110 

Ala Asp Leu Arg Gln Lys Met Gly Asp Ala Ala Val Lys Val Ala Gln 
115 120 125 

Ala Ile Gly Tyr Ile Gly Ala Gly Thr Val Glu Phe Leu Val Asp 
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) Sequence characteristics: 

(A) Length: 50 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 44: 

Ala Thr Gly Asn Phe Tyr Phe Met Glu Met Asn Thr Arg Ile Gln Val 
5 10 15 

Glu His Pro Val Thr Glu Met Ile Thr Gly Leu Asp Leu Ile Ala Glu 
20 25 30 

Gln Ile Arg Ile Ala Gln Gly Glu Ala Leu Arg Phe Arg Gln Ala Asp 
35 40 45 

Ile Gln 
50 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) Sequence characteristics: 

(A) Length: 19 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 45: 

Leu Arg Gly His Ala Ile Glu Cys Arg Ile Asn Ala Glu Asp Pro Glu 
5 10 15 

Tyr Asn Phe 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) Sequence characteristics: 

(A) Length: 9 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 46: 

Arg Pro Asn Pro Gly Arg Ile Thr Gly 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) Sequence characteristics: 

(A) Length: 7 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 47: 

Pro Gly Val Arg Val Asp Ser 
5 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) Sequence characteristics: 

(A) Length: 44 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 48: 

His Val Tyr Thr Asp Tyr Glu Ile Pro Pro Tyr Tyr Asp Ser Leu Ile
5 10 15

Gly Lys Leu Ile Val Trp Gly Ala Thr Arg Glu Glu Ala Ile Ala Arg
20 25 30

Met Gln Arg Ala Leu Arg Glu Cys Ala Ile Thr Gly
35 40 
(2) INFORMATION FOR SEQ ID NO: 49:

(i) Sequence characteristics:

(A) Length: 38 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 49: 

Leu Pro Thr Thr Leu Ser Phe His Gln Leu Met Leu Gln Met Pro Glu
5 10 15

Phe Leu Arg Gly Glu Leu Tyr Thr Asn Phe Val Glu Gln Val Met Leu
20 25 30

Pro Arg Ile Leu Lys Ser
35 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) Sequence characteristics: 

(A) Length: 37 amino acids 

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 50: 

Met Lys Phe Asp Lys Ile Leu Ile Ala Asn Arg Gly Glu Ile Ala Leu
5 10 15

Arg Ile Leu Arg Ala Cys Glu Glu Met Gly Ile Ala Thr Ile Ala Val
20 25 30 

His Ser Thr Val Asp 
35 
(2) INFORMATION FOR SEQ ID NO: 51:

(i) Sequence characteristics:

(A) Length: 21 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 51: 



Arg Asn Ala Leu His Val Gln Leu Ala Asp Glu Ala Val Cys Ile Gly
5 10 15 

Glu Pro Ala Ser Ala 
20 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) Sequence characteristics: 

(A) Length: 38 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 52: 

Lys Ser Tyr Leu Asn Ile Pro Asn Ile Ile Ala Ala Ala Leu Thr Arg
1 5 10 15

Asn Ala Ser Ala Ile His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ala
20 25 30

Lys Phe Ala Glu Ile Cys
35 
(2) INFORMATION FOR SEQ ID NO: 53:

(i) Sequence characteristics: 

(A) Length: 42 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 53:

Ala Asp His His Ile Ala Phe Ile Gly Pro Thr Pro Glu Ala Ile Arg
5 10 15

Leu Met Gly Asp Lys Ser Thr Ala Lys Glu Thr Met Gln Lys Ala Gly
20 25 30 

Val Pro Thr Val Pro Gly Ser Glu Gly Leu 
35 40 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) Sequence characteristics: 

(A) Length: 142 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 54: 

Val Glu Thr Glu Gln Glu Gly Leu Glu Leu Ala Lys Asp Ile Gly Tyr
5 10 15

Pro Val Met Ile Lys Ala Thr Ala Gly Gly Gly Gly Arg Gly Met Arg
20 25 30

Leu Val Arg Ser Pro Asp Glu Phe Val Lys Leu Phe Leu Ala Ala Gln
35 40 45

Gly Glu Ala Gly Ala Ala Phe Gly Asn Ala Gly Val Tyr Ile Glu Lys
50 55 60

Phe Ile Glu Arg Pro Arg His Ile Glu Phe Gln Ile Leu Ala Asp Asn
65 70 75 80

Tyr Gly Asn Val Ile His Leu Gly Glu Arg Asp Cys Ser Ile Gln Arg
85 90 95

Arg Asn Gln Lys Leu Leu Glu Glu Ala Pro Ser Pro Ala Leu Asp Ser
100 105 110

Asp Leu Arg Glu Lys Met Gly Gln Ala Ala Val Lys Ala Ala Gln Phe
115 120 125

Ile Asn Tyr Ala Gly Ala Gly Thr Ile Glu Phe Leu Leu Asp
130 135 140



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) Sequence characteristics:

(A) Length: 50 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 55: 

Arg Ser Gly Gln Phe Tyr Phe Met Glu Met Asn Thr Arg Ile Gln Val
5 10 15

Glu His Pro Val Thr Glu Met Val Thr Gly Val Asp Leu Leu Val Glu
20 25 30

Gln Ile Arg Ile Ala Gln Gly Glu Arg Leu Arg Leu Thr Gln Asp Gln
35 40 45 

Val Val 
50 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) Sequence characteristics: 

(A) Length: 19 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 56: 

Leu Arg Gly His Ala Ile Glu Cys Arg Ile Asn Ala Glu Asp Pro Asp
5 10 15 

His Asp Phe 
(2) INFORMATION FOR SEQ ID NO: 57:

(i) Sequence characteristics:

(A) Length: 9 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 57:

Arg Pro Ala Pro Gly Arg Ile Ser Gly
5 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) Sequence characteristics: 

(A) Length: 6 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 58:

Tyr Leu Pro Pro Gly Gly 
5 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) Sequence characteristics: 

(A) Length: 7 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 59:

Pro Gly Val Arg Ile Asp Ser
5 
(2) INFORMATION FOR SEQ ID NO: 60:

(i) Sequence characteristics: 

(A) Length: 44 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 60:

His Val Tyr Thr Asp Tyr Gln Ile Pro Pro Tyr Tyr Asp Ser Leu Ile
5 10 15

Gly Lys Leu Ile Val Trp Gly Pro Asp Arg Ala Thr Ala Ile Asn Arg
20 25 30

Met Lys Arg Ala Leu Arg Glu Cys Ala Ile Thr Gly
35 40 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) Sequence characteristics: 

(A) Length: 154 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 61: 

Leu Pro Thr Thr Ile Gly Phe His Gln Arg Ile Met Glu Asn Pro Gln
5 10 15

Phe Leu Gln Gly Asn Val Ser Thr Ser Phe Val Gln Glu Met Asn Lys
20 25 30

Pro Leu Asp Phe Asn Glu Ile Arg Gln Leu Leu Thr Thr Ile Ala Gln
35 40 45

Thr Asp Ile Ala Glu Val Thr Leu Lys Ser Asp Asp Phe Glu Leu Thr
50 55 60

Val Arg Lys Ala Val Gly Val Asn Asn Ser Val Val Pro Val Val Thr
65 70 75 80

Ala Pro Leu Ser Gly Val Val Gly Ser Gly Leu Pro Ser Ala Ile Pro
85 90 95

Ile Val Ala His Ala Ala Pro Ser Pro Ser Pro Glu Pro Gly Thr Ser
100 105 110

Arg Ala Ala Asp His Ala Val Thr Ser Ser Gly Ser Gln Pro Gly Ala
115 120 125

Lys Ile Ile Asp Gln Lys Leu Ala Glu Val Ala Ser Pro Met Val Gly
130 135 140 

Thr Phe Tyr Arg Ala Pro Ala Pro Gly Glu 
145 150 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) Sequence characteristics: 

(A) Length: 24 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 62: 

Ala Val Phe Val Glu Val Gly Asp Arg Ile Arg Gln Gly Gln Thr Val
5 10 15

Cys Ile Ile Glu Ala Met Lys Met
20 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) Sequence characteristics: 

(A) Length: 36 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 63: 

Met Leu Asp Lys Ile Val Ile Ala Asn Arg Gly Glu Ile Ala Leu Arg
5 10 15

Ile Leu Arg Ala Cys Lys Glu Leu Gly Ile Lys Thr Val Ala Val His
20 25 30 

Ser Ser Ala Asp 
35 
(2) INFORMATION FOR SEQ ID NO: 64:

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 64: 

Arg Asp Leu Lys His Val Leu Leu Ala Asp Glu Thr Val Cys Ile Gly
5 10 15 

Pro Ala Pro Ser Val 
20 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) Sequence characteristics: 

(A) Length: 38 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 65: 

Lys Ser Tyr Leu Asn Ile Pro Ala Ile Ile Ser Ala Ala Glu Ile Thr
5 10 15

Gly Ala Val Ala Ile His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ala
20 25 30

Asn Phe Ala Glu Gln Val
35 



(2) INFORMATION FOR SEQ ID NO: 66:

(i) Sequence characteristics:

(A) Length: 43 amino acids

(B) Type: Amino acid

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 66: 

Glu Arg Ser Gly Phe Ile Phe Ile Gly Pro Lys Ala Glu Thr Ile Arg
5 10 15

Leu Met Gly Asp Lys Val Ser Ala Ile Ala Ala Met Lys Lys Ala Gly
20 25 30

Val Pro Cys Val Pro Gly Ser Asp Gly Pro Leu
35 40 



(2) INFORMATION FOR SEQ ID NO: 67:

(i) Sequence characteristics:

(A) Length: 141 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 67: 

Gly Asp Asp Met Asp Lys Asn Arg Ala Ile Ala Lys Arg Ile Gly Tyr
5 10 15

Pro Val Ile Ile Lys Ala Ser Gly Gly Gly Gly Gly Arg Gly Met Arg
20 25 30

Val Val Arg Gly Asp Ala Glu Leu Ala Gln Ser Ile Ser Met Thr Arg
35 40 45

Ala Glu Ala Lys Ala Ala Phe Ser Asn Asp Met Val Tyr Met Glu Lys
50 55 60

Tyr Leu Glu Asn Pro Arg His Val Glu Ile Gln Val Leu Ala Asp Gly
65 70 75 80

Gln Gly Asn Ala Ile Tyr Leu Ala Glu Arg Asp Cys Ser Met Gln Arg
85 90 95

Arg His Gln Lys Val Val Glu Glu Ala Pro Ala Pro Gly Ile Thr Pro
100 105 110

Glu Leu Arg Arg Tyr Ile Gly Glu Arg Cys Ala Lys Ala Cys Val Asp
115 120 125

Ile Gly Tyr Arg Gly Ala Gly Thr Phe Glu Phe Leu Phe
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 68:

(i) Sequence characteristics:

(A) Length: 50 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 68: 

Glu Asn Gly Glu Phe Tyr Phe Ile Glu Met Asn Thr Arg Ile Gln Val
5 10 15

Glu His Pro Val Thr Glu Met Ile Thr Gly Val Asp Leu Ile Lys Glu
20 25 30

Gln Met Arg Ile Ala Ala Gly Gln Pro Leu Ser Ile Lys Gln Glu Glu
35 40 45 

Val His 
50 



(2) INFORMATION FOR SEQ ID NO: 69:

(i) Sequence characteristics: 

(A) Length: 25 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 69: 

Val Arg Gly His Ala Val Glu Cys Arg Ile Asn Ala Glu Asp Pro Asn
5 10 15

Leu Pro Ser Pro Gly Lys Ile Thr Arg
20 25 
(2) INFORMATION FOR SEQ ID NO: 70:

(i) Sequence characteristics: 

(A) Length: 6 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 70: 

Phe His Ala Pro Gly Gly 
5 



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) Sequence characteristics: 

(A) Length: 7 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 71: 

Phe Gly Val Arg Trp Glu Ser 
5 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) Sequence characteristics: 

(A) Length: 44 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 72: 

His Ile Tyr Ala Gly Tyr Thr Val Pro Pro Tyr Tyr Asp Ser Met Ile
5 10 15

Gly Lys Leu Ile Cys Tyr Gly Glu Asn Arg Asp Val Ala Ile Ala Arg
20 25 30

Met Lys Asn Ala Leu Gln Glu Leu Ile Ile Asp Gly
35 40 
(2) INFORMATION FOR SEQ ID NO: 73:

(i) Sequence characteristics:

(A) Length: 135 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 73: 

Ile Lys Thr Asn Val Asp Leu Gln Ile Arg Ile Met Asn Asp Glu Asn
5 10 15

Phe Gln His Gly Gly Thr Asn Ile His Tyr Leu Glu Lys Lys Leu Gly
20 25 30

Leu Gln Glu Lys Met Asp Ile Arg Lys Ile Lys Lys Leu Ile Glu Leu
35 40 45

Val Glu Glu Ser Gly Ile Ser Glu Leu Glu Ile Ser Glu Gly Glu Glu
50 55 60

Ser Val Arg Ile Ser Arg Ala Ala Pro Ala Ala Ser Phe Pro Val Met
65 70 75 80

Gln Gln Ala Tyr Ala Ala Pro Met Met Gln Gln Pro Ala Gln Ser Asn
85 90 95

Ala Ala Ala Pro Ala Thr Val Pro Ser Met Glu Ala Pro Ala Ala Ala
100 105 110

Glu Ile Ser Gly His Ile Val Arg Ser Pro Met Val Gly Thr Phe Tyr
115 120 125 

Arg Thr Pro Ser Pro Asp Ala 
130 135 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) Sequence characteristics: 

(A) Length: 57 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 
(D) Topology: Linear

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 74: 

Lys Ala Phe Ile Glu Val Gly Gln Lys Val Asn Val Gly Asp Thr Leu
5 10 15

Cys Ile Val Glu Ala Met Lys Met Met Asn Gln Ile Glu Ala Asp Lys
20 25 30

Ser Gly Thr Val Lys Ala Ile Leu Val Glu Ser Gly Gln Pro Val Glu
35 40 45

Phe Asp Glu Pro Leu Val Val Ile Glu
50 55 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) Sequence characteristics:

(A) Length: 72 amino acids

(B) Type: Amino acid

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 75:

Met Leu Ser Ala Ala Leu Arg Thr Leu Lys His Val Leu Tyr Tyr Ser 
5 10 15 

Arg Gln Cys Leu Met Val Ser Arg Asn Leu Gly Ser Val Gly Tyr Asp
20 25 30

Pro Asn Glu Lys Thr Phe Asp Lys Ile Leu Val Ala Asn Arg Gly Glu
35 40 45

Ile Ala Cys Arg Val Ile Arg Thr Cys Lys Lys Met Gly Ile Lys Thr
50 55 60

Val Ala Ile His Ser Asp Val Asp
65 70 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) Sequence characteristics: 

(A) Length: 21 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 
(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 76:

Ala Ser Ser Val His Val Lys Met Ala Asp Glu Ala Val Cys Val Gly 
5 10 15 

Pro Ala Pro Thr Ser 
20 
(2) INFORMATION FOR SEQ ID NO: 77:

(i) Sequence characteristics: 

(A) Length: 38 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 77: 

Lys Ser Tyr Leu Asn Met Asp Ala Ile Met Glu Ala Ile Lys Lys Thr
5 10 15

Arg Ala Gln Ala Val His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Lys
20 25 30 

Glu Phe Ala Arg Cys Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) Sequence characteristics: 

(A) Length: 41 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 78: 

Ala Ala Glu Asp Val Val Phe Ile Gly Pro Asp Thr His Ala Ile Gln
5 10 15

Ala Met Gly Asp Lys Ile Glu Ser Lys Leu Leu Ala Lys Lys Ala Glu
20 25 30

Val Asn Thr Ile Pro Gly Phe Asp Gly
35 40 
(2) INFORMATION FOR SEQ ID NO: 79:

(i) Sequence characteristics:

(A) Length: 144 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 79: 

Val Lys Asp Ala Glu Glu Ala Val Arg Ile Ala Arg Glu Ile Gly Tyr
5 10 15

Pro Val Met Ile Lys Ala Ser Ala Gly Gly Gly Gly Lys Gly Met Arg
20 25 30

Ile Ala Trp Asp Asp Glu Glu Thr Arg Asp Gly Phe Arg Leu Ser Ser
35 40 45

Gln Glu Ala Ala Ser Ser Phe Gly Asp Asp Arg Leu Leu Ile Glu Lys
50 55 60

Phe Ile Asp Asn Pro Arg His Ile Glu Ile Gln Val Leu Gly Asp Lys
65 70 75 80

His Gly Asn Ala Leu Trp Leu Asn Glu Arg Glu Cys Ser Ile Gln Arg
85 90 95

Arg Asn Gln Lys Val Val Glu Glu Ala Pro Ser Ile Phe Leu Asp Ala
100 105 110

Glu Thr Arg Arg Ala Met Gly Glu Gln Ala Val Ala Leu Ala Arg Ala
115 120 125 

Val Lys Tyr Ser Ser Ala Gly Thr Val Glu Phe Leu Val Asp Ser Lys 
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 80:

(i) Sequence characteristics:

(A) Length: 47 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 80: 

Lys Asn Phe Tyr Phe Leu Glu Met Asn Thr Arg Leu Gln Val Glu His
5 10 15

Pro Val Thr Glu Cys Ile His Trp Pro Gly Pro Ser Pro Gly Lys Thr
20 25 30

Val Leu Gln Glu His Leu Ser Gly Thr Asn Lys Leu Ile Phe Ala
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) Sequence characteristics: 

(A) Length: 29 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 81: 

Phe Asn Gly Trp Ala Val Glu Cys Arg Val Tyr Ala Glu Asp Pro Tyr 
5 10 15 

Lys Ser Phe Gly Leu Pro Ser Ile Gly Arg Leu Ser Gln
20 25 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) Sequence characteristics: 

(A) Length: 14 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 82: 

Tyr Gln Glu Pro Leu His Leu Pro Gly Val Arg Val Asp Ser
5 10 
(2) INFORMATION FOR SEQ ID NO: 83:

(i) Sequence characteristics:

(A) Length: 44 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 83: 

Gly Ile Gln Pro Gly Ser Asp Ile Ser Ile Tyr Tyr Asp Pro Met Ile
5 10 15

Ser Lys Leu Ile Thr Tyr Gly Ser Asp Arg Thr Glu Ala Leu Lys Arg
20 25 30

Met Ala Asp Ala Leu Asp Asn Tyr Val Ile Arg Gly
35 40 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) Sequence characteristics:

(A) Length: 251 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 84: 

Val Thr His Asn Ile Ala Leu Leu Arg Glu Val Ile Ile Asn Ser Arg
5 10 15

Phe Val Lys Gly Asp Ile Ser Thr Lys Phe Leu Ser Asp Val Tyr Pro
20 25 30

Asp Gly Phe Lys Gly His Met Leu Thr Lys Ser Glu Lys Asn Gln Leu
35 40 45

Leu Ala Ile Ala Ser Ser Leu Phe Val Ala Phe Gln Leu Arg Ala Gln
50 55 60

His Phe Gln Glu Asn Ser Arg Met Pro Val Ile Lys Pro Asp Ile Ala
65 70 75 80

Asn Trp Glu Leu Ser Val Lys Leu His Asp Lys Val His Thr Val Val
85 90 95

Ala Ser Asn Asn Gly Ser Val Phe Ser Val Glu Val Asp Gly Ser Lys
100 105 110

Leu Asn Val Thr Ser Thr Trp Asn Leu Ala Ser Pro Leu Leu Ser Val
115 120 125

Ser Val Asp Gly Thr Gln Arg Thr Val Gln Cys Leu Ser Arg Glu Ala
130 135 140

Gly Gly Asn Met Ser Ile Gln Phe Leu Gly Thr Val Tyr Lys Val Asn
145 150 155 160

Ile Leu Thr Arg Leu Ala Ala Glu Leu Asn Lys Phe Met Leu Glu Lys
165 170 175

Val Thr Glu Asp Thr Ser Ser Val Leu Arg Ser Pro Met Pro Gly Val
180 185 190

Val Val Ala Val Ser Val Lys Pro Gly Asp Ala Val Ala Glu Gly Gln
195 200 205

Glu Ile Cys Val Ile Glu Ala Met Lys Met Gln Asn Ser Met Thr Ala
210 215 220

Gly Lys Thr Gly Thr Val Lys Ser Val His Cys Gln Ala Gly Asp Thr
225 230 235 240

Val Gly Glu Gly Asp Leu Leu Val Glu Leu Glu 
245 250 



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) Sequence characteristics: 

(A) Length: 90 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 85: 

Met Pro Tyr Arg Glu Arg Phe Cys Ala Ile Arg Trp Cys Arg Asn Ser
5 10 15

Gly Arg Ser Ser Gln Gln Leu Leu Trp Thr Leu Lys Arg Ala Pro Val
20 25 30

Tyr Ser Gln Gln Cys Leu Val Val Ser Arg Ser Leu Ser Ser Val Glu
35 40 45

Tyr Glu Pro Lys Glu Lys Thr Phe Asp Lys Ile Leu Ile Ala Asn Arg
50 55 60

Gly Glu Ile Ala Cys Arg Val Ile Lys Thr Cys Arg Lys Met Gly Ile
65 70 75 80

Arg Thr Val Ala Ile His Ser Asp Val Asp
85 90 
(2) INFORMATION FOR SEQ ID NO: 86:

(i) Sequence characteristics:

(A) Length: 21 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 86: 

Ala Ser Ser Val His Val Lys Met Ala Asp Glu Ala Val Cys Val Gly 
5 10 15 

Pro Ala Pro Thr Ser 
20 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) Sequence characteristics: 

(A) Length: 38 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 87: 

Lys Ser Tyr Leu Asn Met Asp Ala Ile Met Glu Ala Ile Lys Lys Thr
5 10 15

Gly Ala Gln Ala Val His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Lys
20 25 30 

Glu Phe Ala Lys Cys Leu 
35 



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) Sequence characteristics: 

(A) Length: 41 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 88: 

Ala Ala Glu Asp Val Thr Phe Ile Gly Pro Asp Thr His Ala Ile Gln
5 10 15

Ala Met Gly Asp Lys Ile Glu Ser Lys Leu Leu Ala Lys Arg Ala Lys
20 25 30

Val Asn Thr Ile Pro Gly Phe Asp Gly
35 40 



(2) INFORMATION FOR SEQ ID NO: 89:

(i) Sequence characteristics:

(A) Length: 144 amino acids

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 89:

Leu Lys Asp Ala Asp Glu Ala Val Arg Ile Ala Arg Glu Ile Gly Tyr
5 10 15

Pro Val Met Ile Lys Ala Ser Ala Gly Gly Gly Gly Lys Gly Met Arg
20 25 30

Ile Pro Trp Asp Asp Glu Glu Thr Arg Asp Gly Phe Arg Phe Ser Ser
35 40 45

Gln Glu Ala Ala Ser Ser Phe Gly Asp Asp Arg Leu Leu Ile Glu Lys
50 55 60

Phe Ile Asp Asn Pro Arg His Ile Glu Ile Gln Val Leu Gly Asp Lys
65 70 75 80

His Gly Asn Ala Leu Trp Leu Asn Glu Arg Glu Cys Ser Ile Gln Arg
85 90 95

Arg Asn Gln Lys Val Val Glu Glu Ala Pro Ser Ile Phe Leu Asp Pro
100 105 110

Glu Thr Arg Arg Ala Met Gly Glu Gln Ala Val Ala Trp Pro Lys Ala
115 120 125

Val Lys Tyr Ser Ser Ala Gly Thr Val Glu Phe Leu Val Asp Ser Gln
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 90:

(i) Sequence characteristics: 

(A) Length: 48 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 90: 

Lys Asn Phe Tyr Phe Leu Glu Met Asn Thr Arg Leu Gln Val Glu His
5 10 15

Pro Val Thr Glu Cys Ile Thr Gly Leu Asp Leu Val Gln Glu Met Ile
20 25 30

Leu Val Ala Lys Gly Tyr Pro Leu Arg His Lys Gln Glu Asp Ile Pro
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) Sequence characteristics: 

(A) Length: 29 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 91: 

Ile Ser Gly Trp Ala Val Glu Cys Arg Val Tyr Ala Glu Asp Pro Tyr
5 10 15

Lys Ser Phe Gly Leu Pro Ser Ile Gly Arg Leu Ser Gln
20 25 



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) Sequence characteristics: 

(A) Length: 14 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 92: 

Tyr Gln Glu Pro Ile His Leu Pro Gly Val Arg Val Asp Ser
5 10 
(2) INFORMATION FOR SEQ ID NO: 93:

(i) Sequence characteristics:

(A) Length: 44 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 93: 

Gly Ile Gln Pro Gly Ser Asp Ile Ser Ile Tyr His Asp Pro Met Ile
5 10 15

Ser Lys Leu Val Thr Tyr Gly Ser Asp Arg Ala Glu Ala Leu Lys Arg
20 25 30

Met Glu Asp Ala Leu Asp Ser Tyr Val Ile Arg Gly
35 40 



(2) INFORMATION FOR SEQ ID NO: 94:

(i) Sequence characteristics:

(A) Length: 251 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 94:

Val Thr His Asn Ile Pro Leu Leu Arg Glu Val Ile Ile Asn Thr Arg
5 10 15

Phe Val Lys Gly Asp Ile Ser Thr Lys Phe Leu Ser Asp Val Tyr Pro
20 25 30

Asp Gly Phe Lys Gly His Met Leu Thr Pro Ser Glu Arg Asp Gln Leu
35 40 45

Leu Ala Ile Ala Ser Ser Leu Phe Val Ala Ser Gln Leu Arg Ala Gln
50 55 60

Arg Phe Gln Glu His Ser Arg Val Pro Val Ile Arg Pro Asp Val Ala
65 70 75 80

Lys Trp Glu Leu Ser Val Lys Leu His Asp Glu Asp His Thr Val Val
85 90 95

Ala Ser Asn Asn Gly Pro Thr Phe Asn Val Glu Val Asp Gly Ser Lys
100 105 110

Leu Asn Val Thr Ser Thr Trp Asn Leu Ala Ser Pro Leu Leu Ser Val
115 120 125

Asn Val Asp Gly Thr Gln Arg Thr Val Gln Cys Leu Ser Pro Asp Ala
130 135 140

Gly Gly Asn Met Ser Ile Gln Phe Leu Gly Thr Val Tyr Lys Val His
145 150 155 160

Ile Leu Thr Lys Leu Ala Ala Glu Leu Asn Lys Phe Met Leu Glu Lys
165 170 175

Val Pro Lys Asp Thr Ser Ser Val Leu Arg Ser Pro Lys Pro Gly Val
180 185 190

Val Val Ala Val Ser Val Lys Pro Gly Asp Met Val Ala Glu Gly Gln
195 200 205

Glu Ile Cys Val Ile Glu Ala Met Lys Met Gln Asn Ser Met Thr Ala
210 215 220 

Gly Lys Met Gly Lys Val Lys Leu Val His Cys Lys Ala Gly Asp Thr 
225 230 235 240 

Val Gly Glu Gly Asp Leu Leu Val Glu Leu Glu 
245 250 



(2) INFORMATION FOR SEQ ID NO: 95:

(i) Sequence characteristics:

(A) Length: 17 amino acids

(B) Type: Amino acid

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 95: 

Gln Arg Lys Phe Ala Gly Leu Arg Asp Asn Phe Asn Leu Leu Gly Glu
5 10 15 

Lys 
(2) INFORMATION FOR SEQ ID NO: 96:

(i) Sequence characteristics:

(A) Length: 34 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 96: 

Asn Lys Ile Leu Val Ala Asn Arg Gly Glu Ile Pro Ile Arg Ile Phe
5 10 15

Arg Thr Ala His Glu Leu Ser Met Gln Thr Val Ala Ile Tyr Ser His
20 25 30 

Glu Asp 



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) Sequence characteristics: 

(A) Length: 24 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 97: 

Arg Leu Ser Thr His Lys Gln Lys Ala Asp Glu Ala Tyr Val Ile Gly
5 10 15

Glu Val Gly Gln Tyr Thr Pro Val
20 
(2) INFORMATION FOR SEQ ID NO: 98:

(i) Sequence characteristics:

(A) Length: 38 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 98: 

Gly Ala Tyr Leu Ala Ile Asp Glu Ile Ile Ser Ile Ala Gln Lys His
5 10 15

Gln Val Asp Phe Ile His Pro Gly Tyr Gly Phe Leu Ser Glu Asn Ser
20 25 30 

Glu Phe Ala Asp Lys Val 
35 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) Sequence characteristics: 

(A) Length: 41 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 99: 

Val Lys Ala Gly Ile Thr Trp Ile Gly Pro Pro Ala Glu Val Ile Asp
5 10 15 

Ser Val Gly Asp Lys Val Ser Ala Arg Asn Leu Ala Ala Lys Ala Asn 
20 25 30 

Val Pro Thr Val Pro Gly Thr Pro Gly 
35 40 
(2) INFORMATION FOR SEQ ID NO: 100:

(i) Sequence characteristics:

(A) Length: 144 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 100: 

Ile Glu Thr Val Glu Glu Ala Leu Asp Phe Val Asn Glu Tyr Gly Tyr
5 10 15

Pro Val Ile Ile Lys Ala Ala Phe Gly Gly Gly Gly Arg Gly Met Arg
20 25 30

Val Val Arg Glu Gly Asp Asp Val Ala Asp Ala Phe Gln Arg Ala Thr
35 40 45

Ser Glu Ala Arg Thr Ala Phe Gly Asn Gly Thr Cys Phe Val Glu Arg
50 55 60

Phe Leu Asp Lys Pro Lys His Ile Glu Val Gln Leu Leu Ala Asp Asn
65 70 75 80

His Gly Asn Val Val His Leu Phe Glu Arg Asp Cys Ser Val Gln Arg
85 90 95

Arg His Gln Lys Val Val Glu Val Ala Pro Ala Lys Thr Leu Pro Arg
100 105 110

Glu Val Arg Asp Ala Ile Leu Thr Asp Ala Val Lys Leu Ala Lys Glu
115 120 125

Cys Gly Tyr Arg Asn Ala Gly Thr Ala Glu Phe Leu Val Asp Asn Gln
130 135 140 
(2) INFORMATION FOR SEQ ID NO: 101:

(i) Sequence characteristics: 

(A) Length: 51 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 101: 

Asn Arg His Tyr Phe Ile Glu Ile Asn Pro Arg Ile Gln Val Glu His
5 10 15

Thr Ile Thr Glu Glu Ile Thr Gly Ile Asp Ile Val Ala Ala Gln Ile
20 25 30

Gln Ile Ala Ala Gly Ala Ser Leu Pro Gln Leu Gly Leu Phe Gln Asp
35 40 45

Lys Ile Thr
50 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) Sequence characteristics:

(A) Length: 20 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 102: 

Thr Arg Gly Phe Ala Ile Gln Cys Arg Ile Thr Thr Glu Asp Pro Ala
5 10 15

Lys Asn Phe Gln
20 
(2) INFORMATION FOR SEQ ID NO: 103:

(i) Sequence characteristics: 

(A) Length: 14 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 103: 

Pro Asp Thr Gly Arg Ile Glu Val Tyr Arg Ser Ala Gly Gly
5 10 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) Sequence characteristics: 

(A) Length: 52 amino acids

(B) Type: Amino acid

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 104: 

Asn Gly Val Arg Leu Asp Gly Gly Asn Ala Tyr Ala Gly Thr Ile Ile
5 10 15

Ser Pro His Tyr Asp Ser Met Leu Val Lys Cys Ser Cys Ser Gly Ser
20 25 30

Thr Tyr Glu Ile Val Arg Arg Lys Met Ile Arg Ala Leu Ile Glu Phe
35 40 45

Arg Ile Arg Gly
50 
(2) INFORMATION FOR SEQ ID NO: 105:

(i) Sequence characteristics:

(A) Length: 257 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 105: 

Val Lys Thr Asn Ile Pro Phe Leu Leu Thr Leu Leu Thr Asn Pro Val
5 10 15

Phe Ile Glu Gly Thr Tyr Trp Gly Thr Phe Ile Asp Asp Thr Pro Gln
20 25 30

Leu Phe Gln Met Val Ser Ser Gln Asn Arg Ala Gln Lys Leu Leu His
35 40 45

Tyr Leu Ala Asp Val Ala Asp Asn Gly Ser Ser Ile Lys Gly Gln Ile
50 55 60

Gly Leu Pro Lys Leu Lys Ser Asn Pro Ser Val Pro His Ser Tyr Asn
65 70 75 80

Met Tyr Pro Arg Val Tyr Glu Asp Phe Gln Lys Met Arg Glu Thr Tyr
85 90 95

Gly Asp Leu Ser Val Leu Pro Thr Arg Ser Phe Leu Ser Pro Leu Glu
100 105 110

Thr Asp Glu Glu Ile Glu Val Val Ile Glu Gln Gly Lys Thr Leu Ile
115 120 125

Ile Lys Leu Gln Ala Val Gly Asp Leu Asn Lys Lys Thr Gly Glu Arg
130 135 140

Glu Val Tyr Phe Asp Leu Asn Gly Glu Met Arg Lys Ile Arg Val Ala
145 150 155 160

Asp Arg Ser Gln Lys Val Glu Thr Val Thr Lys Ser Lys Ala Asp Met
165 170 175

His Asp Pro Leu His Ile Gly Ala Pro Met Ala Gly Val Ile Val Glu
180 185 190

Val Lys Val His Lys Gly Ser Leu Ile Lys Lys Gly Gln Pro Val Ala
195 200 205

Val Leu Ser Ala Met Lys Met Glu Met Ile Ile Ser Ser Pro Ser Asp
210 215 220

Gly Gln Val Lys Glu Val Phe Val Ser Asp Gly Glu Asn Val Asp Ser
225 230 235 240

Ser Asp Leu Leu Val Leu Leu Glu Asp Gln Val Pro Val Glu Thr Lys
245 250 255 

Ala 
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(2) INFORMATION FOR SEQ ID NO: 106: 
(i) Sequence characteristics: 

(A) Length: 165 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear

(ii) Molecule type: Peptide

(xi) Sequence Description: SEQ ID NO: 106:

Val Leu Thr Val Ala Leu Phe Pro Gln Pro Gly Leu Lys Phe Leu Glu
5 10 15

Asn Arg His Asn Pro Ala Ala Phe Glu Pro Val Pro Gln Ala Glu Ala
20 25 30

Ala Gln Pro Val Ala Lys Ala Glu Lys Pro Ala Ala Ser Gly Val Tyr
35 40 45

Thr Val Glu Val Glu Gly Lys Ala Phe Val Val Lys Val Ser Asp Gly
50 55 60

Gly Asp Val Ser Gln Leu Thr Ala Ala Ala Pro Ala Pro Ala Pro Ala
65 70 75 80

Pro Ala Pro Ala Ser Ala Pro Ala Ala Ala Ala Pro Ala Gly Ala Gly
85 90 95

Thr Pro Val Thr Ala Pro Leu Ala Gly Thr Ile Trp Lys Val Leu Ala
100 105 110

Ser Glu Gly Gln Thr Val Ala Ala Gly Glu Val Leu Leu Ile Leu Glu
115 120 125

Ala Met Lys Met Glu Thr Glu Ile Arg Ala Ala Gln Ala Gly Thr Val
130 135 140

Arg Gly Ile Ala Val Lys Ala Gly Asp Ala Val Ala Val Gly Asp Thr
145 150 155 160

Leu Met Thr Leu Ala
165

(2) INFORMATION FOR SEQ ID NO: 107:
(i) Sequence characteristics:

(A) Length: 123 amino acids 

(B) Type: Amino acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 107:

Met Lys Leu Lys Val Thr Val Asn Gly Thr Ala Tyr Asp Val Asp Val
5 10 15

Asp Val Asp Lys Ser His Glu Asn Pro Met Gly Thr Ile Leu Phe Gly
20 25 30

Gly Gly Thr Gly Gly Ala Pro Ala Pro Arg Ala Ala Gly Gly Ala Gly
35 40 45

Ala Gly Lys Ala Gly Glu Gly Glu Ile Pro Ala Pro Leu Ala Gly Thr
50 55 60

Val Ser Lys Ile Leu Val Lys Glu Gly Asp Thr Val Lys Ala Gly Gln
65 70 75 80

Thr Val Leu Val Leu Glu Ala Met Lys Met Glu Thr Glu Ile Asn Ala
85 90 95

Pro Thr Asp Gly Lys Val Glu Lys Val Leu Val Lys Glu Arg Asp Ala
100 105 110

Val Gln Gly Gly Gln Gly Leu Ile Lys Ile Gly
115 120

(2) INFORMATION FOR SEQ ID NO: 108:

(i) Sequence characteristics:

(A) Length: 1473 base pairs

(B) Type: Nucleic acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Oligonucleotide

(xi) Sequence Description: SEQ ID NO: 108:



GTGATGATCA AGGCATCATG GGGTGGGGGT GGTAAAGGAA TAAGGAAGGT ACATAATGAT 60 



GATGAGGTCA GAGCATTGTT TAAGCAAGTG CAAGGAGAAG TCCCCGGATC GCCTATATTT 120
ATTATGAAGG TGGCATCTCA GAGTCGACAT CTAGAGGTTC AATTGCTCTG TGACAAGCAT 180 
GGCAACGTGG CAGCACTGCA CAGTCGAGAC TGTAGTGTTC AAAGAAGGCA TCAAAAGATC 240 
ATTGAGGAGG GACCAATTAC AGTTGCTCCT CCAGAAACAA TTAAAGAGCT TGAGCAGGCG 300 
GCAAGGCGAC TAGCTAAATG TGTGCAATAT CAGGGTGCTG CTACAGTGGA ATATCTGTAC 360
AGCATGGAAA CAGGCGAATA CTATTTCCTG GAGCTTAATC CAAGGTTGCA GGTAGAACAC 420 
CCTGTGACCG AATGGATTGC TGAAATAAAC TTACCYGCAT CTCAAGTTGT AGTAGGAATG 480 
GGCATACCAC TCTACAACAT TCCAGAGATC AGACGCTTTT ATGGAATAGA ACATGGAGGT 540
GGCTATCAYG CTTGGAAGGA AATATCAGCT GTTGCAACTA AATTTGATYT GGACAAAGCA 600 
CAGTCTGTAA AGCCAAARGG TCATTGTGTA GCAGTTAGAG TTACTAGCGA GGATCCAGAT 660 
GATGGGTTTA AGCCTACMAG TGGAAGAGTR GAAGAGCTGA ACTTTAAAAG TAAACCCAAT 720
GTTTGGGCCT ATTTCTCYGT TARGTCCGGA GGTGCAATTC AYGAGTTCTC TGATTCCCAG 780 
TTTGGTCATG TTTTTGCTTY TGGGGAATCT AGGTCWTTGG CAATAGCCAA TATGGTACTT 840 
GGGTTAAAAG AGATCCAAAT TCGTGGAGAG ATACGCACTA ATGTTGACTA CACTGTGGAT 900



CTCTTGAATG CTGCAGAGTA CCGAGAAAAT AWGATTCACA CTGGTTGGCT AGACAGCAGA 960
ATAGCWATGC GYGTTAGAGC AGAGAGGCCC CCATGGTACC TTTCAGTTGT TGGTGGAGCT 1020
CTATATGAAG CATCAAGCAG GAGCTCGAGT GTTGTAACCG ATTATGTTGG TTATCTCAGT 1080 
AAAGGTCAAA TACCACCAAA GCACATCTCT CTTGTCAAYT TGACTGTAAC ACTGAATATA 1140 
GATGGGAGCA AATATACGAT TGAGACAGTA CGAGGTGGAC CCCGTAGCTA CAAATTAAGA 1200 
ATTAATGAAT CAGAGGTTGA RGCAGAGATA CATTTCCTGC GAGATGGCGG ACYCTTAATG 1260 
CAGTYGGATG GAAACAGTCA TGTAATTTAC GCCGAGACAG AAGCTKCTGG CACGCGCCTT 1320 
CTAATCAATG GGAGAACATG CTTATTACAG AAAGAGCAYG ATCCTTCCAG GTTGTTGGCT 1380 
GATACACCRT GCAARCTTCT TCGGTTTTTG GTCGCGGATR GTTCTCATGT GGTTGCTGAT 1440 
ACGCCATATG CYGAGGTGGA GGCCATGAAA ATG 1473 



(2) INFORMATION FOR SEQ ID NO: 109:

(i) Sequence characteristics:

(A) Length: 491 amino acids

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Peptide

(ix) Features

(A) NAME/KEY: Xaa 

(B) LOCATION: 248, 267, 311, 412, 418, 422, 436, and 474 

(C) IDENTIFICATION METHOD: Xaa = any amino acid 

(xi) Sequence Description: SEQ ID NO: 109: 

Val Met Ile Lys Ala Ser Trp Gly Gly Gly Gly Lys Gly Ile Arg Lys
5 10 15

Val His Asn Asp Asp Glu Val Arg Ala Leu Phe Lys Gln Val Gln Gly
20 25 30

Glu Val Pro Gly Ser Pro Ile Phe Ile Met Lys Val Ala Ser Gln Ser
35 40 45

Arg His Leu Glu Val Gln Leu Leu Cys Asp Lys His Gly Asn Val Ala
50 55 60

Ala Leu His Ser Arg Asp Cys Ser Val Gln Arg Arg His Gln Lys Ile
65 70 75 80

Ile Glu Glu Gly Pro Ile Thr Val Ala Pro Pro Glu Thr Ile Lys Glu
85 90 95

Leu Glu Gln Ala Ala Arg Arg Leu Ala Lys Cys Val Gln Tyr Gln Gly
100 105 110

Ala Ala Thr Val Glu Tyr Leu Tyr Ser Met Glu Thr Gly Glu Tyr Tyr
115 120 125

Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Val Thr Glu
130 135 140

Trp Ile Ala Glu Ile Asn Leu Pro Ala Ser Gln Val Val Val Gly Met
145 150 155 160

Gly Ile Pro Leu Tyr Asn Ile Pro Glu Ile Arg Arg Phe Tyr Gly Ile
165 170 175

Glu His Gly Gly Gly Tyr His Ala Trp Lys Glu Ile Ser Ala Val Ala
180 185 190

Thr Lys Phe Asp Leu Asp Lys Ala Gln Ser Val Lys Pro Lys Gly His
195 200 205

Cys Val Ala Val Arg Val Thr Ser Glu Asp Pro Asp Asp Gly Phe Lys
210 215 220

Pro Thr Ser Gly Arg Val Glu Glu Leu Asn Phe Lys Ser Lys Pro Asn
225 230 235 240

Val Trp Ala Tyr Phe Ser Val Xaa Ser Gly Gly Ala Ile His Glu Phe
245 250 255

Ser Asp Ser Gln Phe Gly His Val Phe Ala Xaa Gly Glu Ser Arg Ser
260 265 270

Leu Ala Ile Ala Asn Met Val Leu Gly Leu Lys Glu Ile Gln Ile Arg
275 280 285

Gly Glu Ile Arg Thr Asn Val Asp Tyr Thr Val Asp Leu Leu Asn Ala
290 295 300

Ala Glu Tyr Arg Glu Asn Xaa Ile His Thr Gly Trp Leu Asp Ser Arg
305 310 315 320

Ile Ala Met Arg Val Arg Ala Glu Arg Pro Pro Trp Tyr Leu Ser Val
325 330 335

Val Gly Gly Ala Leu Tyr Glu Ala Ser Ser Arg Ser Ser Ser Val Val
340 345 350

Thr Asp Tyr Val Gly Tyr Leu Ser Lys Gly Gln Ile Pro Pro Lys His
355 360 365

Ile Ser Leu Val Asn Leu Thr Val Thr Leu Asn Ile Asp Gly Ser Lys
370 375 380

Tyr Thr Ile Glu Thr Val Arg Gly Gly Pro Arg Ser Tyr Lys Leu Arg
385 390 395 400

Ile Asn Glu Ser Glu Val Glu Ala Glu Ile His Xaa Leu Arg Asp Gly
405 410 415

Gly Xaa Leu Met Gln Xaa Asp Gly Asn Ser His Val Ile Tyr Ala Glu
420 425 430

Thr Glu Ala Xaa Gly Thr Arg Leu Leu Ile Asn Gly Arg Thr Cys Leu
435 440 445

Leu Gln Lys Glu His Asp Pro Ser Arg Leu Leu Ala Asp Thr Pro Cys
450 455 460

Lys Leu Leu Arg Phe Leu Val Ala Asp Xaa Ser His Val Val Ala Asp
465 470 475 480

Thr Pro Tyr Ala Glu Val Glu Ala Met Lys Met
485 490
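
The Xaa positions listed for SEQ ID NO: 109 fall where a codon of SEQ ID NO: 108 carries an ambiguity code whose alternatives encode different residues. The following is a minimal sketch of that bookkeeping, assuming the standard genetic code and the standard IUPAC nucleotide codes; the function names are illustrative and are not part of the disclosure. It translates nucleotides 721 to 750 of SEQ ID NO: 108, which cover residues 241 to 250 of SEQ ID NO: 109.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def translate_codon(codon):
    """Return the one-letter residue, or 'X' (Xaa) if the codon is ambiguous."""
    residues = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[b] for b in codon))
    }
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna):
    """Translate a nucleotide string in frame 1, ignoring spaces."""
    dna = dna.replace(" ", "").upper()
    return "".join(
        translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3)
    )


# Nucleotides 721-750 of SEQ ID NO: 108 as printed in the listing.
fragment = "GTTTGGGCCT ATTTCTCYGT TARGTCCGGA"
print(translate(fragment))
```

In this fragment the codon TCY still encodes Ser for either pyrimidine, whereas ARG can be read as AAG (Lys) or AGG (Arg), so residue 248 is reported as Xaa.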

(2) INFORMATION FOR SEQ ID NO: 110: 
(i) Sequence characteristics:

(A) Length: 436 base pairs

(B) Type: Nucleic acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Oligonucleotide

(xi) Sequence Description: SEQ ID NO: 110:

TCTAGACTTT AACGAGATTC GTCAACTGCT GACAACTATT GCACAAACAG ATATCGCGGA 60 

AGTAACGCTC AAAAGTGATG ATTTTGAACT AACGGTGCGT AAAGCTGTTG GTGTGAATAA 120 

TAGTGTTGTG CCGGTTGTGA CAGCACCCTT GAGTGGTGTG GTAGGTTCGG GATTGCCATC 180 

GGCTATACCG ATTGTAGCCC ATGCTGCCCA ATCTCCATCT CCAGAGCCGG GAACAAGCCG 240 

TGCTGCTGAT CATGCTGTCA CGAGTTCTGG CTCACAGCCA GGAGCAAAAA TCATTGACCA 300 

AAAATTAGCA GAAGTGGCTT CCCCAATGGT GGGAACATTT TACCGCGCTC CTGCACCAGG 360 

TGAAGCGGTA TTTGTGGAAG TCGGCGATCG CATCCGTCAA GGTCAAACCG TCTGCATCAT 420

CGAAGCGATG AAAATG 436



WO 94/08016 PCT/US93/09340

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) Sequence characteristics: 

(A) Length: 145 amino acids 

(B) Type: Amino acid

(C) Strandedness: Single

(D) Topology: Linear 

(ii) Molecule type: Peptide 

(xi) Sequence Description: SEQ ID NO: 111: 

Leu Asp Phe Asn Glu Ile Arg Gln Leu Leu Thr Thr Ile Ala Gln Thr
5 10 15

Asp Ile Ala Glu Val Thr Leu Lys Ser Asp Asp Phe Glu Leu Thr Val
20 25 30

Arg Lys Ala Val Gly Val Asn Asn Ser Val Val Pro Val Val Thr Ala
35 40 45

Pro Leu Ser Gly Val Val Gly Ser Gly Leu Pro Ser Ala Ile Pro Ile
50 55 60

Val Ala His Ala Ala Pro Ser Pro Ser Pro Glu Pro Gly Thr Ser Arg
65 70 75 80

Ala Ala Asp His Ala Val Thr Ser Ser Gly Ser Gln Pro Gly Ala Lys
85 90 95

Ile Ile Asp Gln Lys Leu Ala Glu Val Ala Ser Pro Met Val Gly Thr
100 105 110

Phe Tyr Arg Ala Pro Ala Pro Gly Glu Ala Val Phe Val Glu Val Gly
115 120 125

Asp Arg Ile Arg Gln Gly Gln Thr Val Cys Ile Ile Glu Ala Met Lys
130 135 140

Met
145

(2) INFORMATION FOR SEQ ID NO: 112:
(i) Sequence characteristics: 

(A) Length: 22 base units 

(B) Type: Nucleic acid 

(C) Strandedness: Single 

(D) Topology: Linear

(ii) Molecule type: Oligonucleotide

(ix) Features

(A) NAME/KEY: N

(B) LOCATION: 11, 14

(C) IDENTIFICATION METHOD: N = A, G, C, T
(xi) Sequence Description: SEQ ID NO: 112: 

TCGAATTCGT NATNATHAAR GC 
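
The oligonucleotide of SEQ ID NO: 112 is written with degenerate positions (N, H, R). As a rough illustration, the sketch below counts how many distinct sequences the degenerate string stands for and checks that its 3' portion is compatible with the first bases of SEQ ID NO: 108. It assumes the standard IUPAC nucleotide codes, and it treats the first eight bases (TCGAATTC) as a non-template 5' extension; that split and the helper names are assumptions made for this example only.

```python
import re
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def to_regex(primer):
    """Turn a degenerate primer into a regular expression over A, C, G, T."""
    return "".join(
        base if len(IUPAC[base]) == 1 else "[" + IUPAC[base] + "]"
        for base in primer
    )


primer = "TCGAATTCGTNATNATHAARGC"   # SEQ ID NO: 112 with spaces removed
core = primer[8:]                   # assumed template-matching part
template = "GTGATGATCAAGGCATCATG"   # first 20 nt of SEQ ID NO: 108

print(degeneracy(primer))
print(to_regex(core))
print(bool(re.match(to_regex(core), template)))
```

Under these assumptions the degenerate portion reads GTN ATN ATH AAR GC, which is compatible with the codons for Val Met Ile Lys Ala at the start of SEQ ID NO: 109.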



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) Sequence characteristics: 

(A) Length: 22 base pairs 

(B) Type: Nucleic acid

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Oligonucleotide 

(ix) Features 

(A) NAME/KEY: N 

(B) LOCATION: 17 

(C) IDENTIFICATION METHOD: N = A, G, C, T 
(xi) Sequence Description: SEQ ID NO: 113: 

GCTCTAGAGK RTGYTCNACY TC 



(2) INFORMATION FOR SEQ ID NO: 114:

(i) Sequence characteristics: 

(A) Length: 21 base pairs 

(B) Type: Nucleic acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Oligonucleotide 
(xi) Sequence Description: SEQ ID NO: 114: 

GCTCTAGAAT ACTATTTCCT G 




(2) INFORMATION FOR SEQ ID NO: 115: 

(i) Sequence characteristics: 

(A) Length: 22 base pairs

(B) Type: Nucleic acid

(C) Strandedness: Single

(D) Topology: Linear

(ii) Molecule type: Oligonucleotide

(ix) Features

(A) NAME/KEY: N

(B) LOCATION: 10, 20 

(C) IDENTIFICATION METHOD: N = A, G, C, T 
(xi) Sequence Description: SEQ ID NO: 115: 

TCGAATTCWN CATYTTCATN RC 
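
SEQ ID NO: 115 can be compared with the coding strand by taking its reverse complement. The sketch below does this with IUPAC-aware complementation and then checks the result, position by position, against the last twelve bases of SEQ ID NO: 108. The complement table follows the standard IUPAC conventions; the function name and the choice of a twelve-base window are assumptions made for illustration.

```python
# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

# Complement of every IUPAC code (for example R = A/G pairs with Y = C/T).
COMPLEMENT = str.maketrans("ACGTRYMKSWHBVDN", "TGCAYRKMSWDVBHN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) sequence, ignoring spaces."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


primer = "TCGAATTCWN CATYTTCATN RC"   # SEQ ID NO: 115 as printed
rc = reverse_complement(primer)
print(rc)

# Last 12 nt of SEQ ID NO: 108 (codons for Ala Met Lys Met).
tail = "GCCATGAAAATG"
compatible = all(base in IUPAC[code] for code, base in zip(rc[:12], tail))
print(compatible)
```

Read this way, the primer's reverse complement begins GYN ATG AAR ATG, which is compatible with the Ala Met Lys Met codons that close SEQ ID NO: 108.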



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) Sequence characteristics: 

(A) Length: 23 base pairs 

(B) Type: Nucleic acid 

(C) Strandedness: Single 

(D) Topology: Linear 

(ii) Molecule type: Oligonucleotide 
(xi) Sequence Description: SEQ ID NO: 116: 

GCTCTAGAYT TYAAYGARAT HMG

WHAT IS CLAIMED IS:

1. An isolated and purified polynucleotide of from about
1350 to about 40,000 base pairs that encodes a polypeptide having the ability
to catalyze the carboxylation of a biotin carboxyl carrier protein of a
cyanobacterium.

2. The polynucleotide according to claim 1 wherein said 
polypeptide is a subunit of acetyl-CoA carboxylase and participates in the 
carboxylation of acetyl-CoA. 

3. The polynucleotide according to claim 1 wherein said 
cyanobacterium is Anabaena or Synechococcus. 

4. The polynucleotide according to claim 3 wherein said
biotin carboxyl carrier protein includes the amino acid residue sequence
shown in SEQ ID NO:111 or a functional equivalent thereof.

5. The polynucleotide according to claim 1 wherein said 
polypeptide has the amino acid residue sequence of Figure 1 or Figure 2. 

6. The polynucleotide according to claim 1 that includes
(a) the DNA sequence of SEQ ID NO:1 from about nucleotide position 1300
to about nucleotide position 2650; (b) the DNA sequence of SEQ ID NO:1;
or (c) the DNA sequence of SEQ ID NO:5.

7. An isolated and purified polynucleotide of from about 
480 to about 40,000 base pairs that encodes a biotin carboxyl carrier protein 
of a cyanobacterium. 

8. The polynucleotide according to claim 7 wherein said
cyanobacterium is Anabaena.

9. The polynucleotide according to claim 8 wherein said
biotin carboxyl carrier protein includes the amino acid residue sequence of
SEQ ID NO:111 or a functional equivalent thereof.

10. The polynucleotide according to claim 7 that includes 
the DNA sequence of SEQ ID NO: 110. 

11. An isolated and purified DNA molecule comprising a 
promoter operatively linked to a coding region that encodes a polypeptide 
having the ability to catalyze the carboxylation of a biotin carboxyl carrier 
protein of a cyanobacterium, which coding region is operatively linked to a 
transcription-terminating region, whereby said promoter drives the 
transcription of said coding region. 

12. An isolated and purified DNA molecule comprising a
promoter operatively linked to a coding region that encodes a biotin
carboxyl carrier protein of a cyanobacterium.

13. An isolated and purified polynucleotide of from about 
1500 to about 150,000 base pairs that encodes a plant polypeptide having the 
ability to catalyze the carboxylation of acetyl-CoA. 

14. The polynucleotide according to claim 13 wherein said 
plant polypeptide is a monocotyledonous plant polypeptide. 

15. The polynucleotide according to claim 14 wherein said 
monocotyledonous plant is wheat, rice, maize, barley, rye, oats or timothy 
grass. 

16. The polynucleotide according to claim 13 wherein said
plant polypeptide is a dicotyledonous plant polypeptide.

17. The polynucleotide according to claim 16 wherein said
dicotyledonous plant is soybean, rape, sunflower, tobacco, Arabidopsis,
petunia, canola, pea, bean, tomato, potato, lettuce, spinach, carrot,
alfalfa, or cotton.

18. The polynucleotide according to claim 13 wherein said 
plant polypeptide includes the amino acid residue sequence of SEQ ID 
NO: 109. 

19. The polynucleotide according to claim 13 that includes 
the nucleotide sequence of SEQ ID NO:7. 

20. An isolated polypeptide having the ability to catalyze 
the carboxylation of a biotin carboxyl carrier protein of a cyanobacterium. 

21. The polypeptide according to claim 20 wherein said 
cyanobacterium is Anabaena or Synechococcus. 

22. The polypeptide according to claim 20 wherein said
biotin carboxyl carrier protein includes the amino acid sequence of SEQ ID
NO:111.

23. The polypeptide according to claim 20 having the 
amino acid residue sequence of Figure 1 or Figure 2. 

24. An isolated and purified biotin carboxyl carrier protein 
of a cyanobacterium. 



25. The protein according to claim 24 wherein said
cyanobacterium is Anabaena.

26. The protein according to claim 25 including the amino
acid residue sequence of SEQ ID NO: 111.

27. An isolated and purified plant polypeptide having a
molecular weight of about 220 kD, dimers of which have the ability to
catalyze the carboxylation of acetyl-CoA.

28. A process of increasing the herbicide resistance of a 
monocotyledonous plant comprising transforming said plant with a DNA 
molecule comprising a promoter operatively linked to a coding region that 
encodes a herbicide resistant polypeptide having the ability to catalyze the 
carboxylation of acetyl-CoA, which coding region is operatively linked to a 
transcription-terminating region, whereby said promoter is capable of driving 
the transcription of said coding region in a monocotyledonous plant. 

29. The process according to claim 28 wherein said 
polypeptide is an acetyl-CoA carboxylase enzyme. 

30. The process according to claim 29 wherein said acetyl-CoA
carboxylase is a dicotyledonous plant acetyl-CoA carboxylase enzyme.

31. The process according to claim 30 wherein said coding 
region includes the DNA sequence of SEQ ID NO: 108. 

32. The process according to claim 28 wherein said 
promoter is CaMV35. 



33. A transformed plant produced in accordance with the 
process of claim 28. 



34. A transgenic plant having incorporated into its genome
a transgene that encodes a dicotyledonous polypeptide having the ability to
catalyze the carboxylation of acetyl-CoA.

35. A process of altering the carboxylation of acetyl-CoA 
in a cell comprising transforming said cell with a DNA molecule comprising 
a promoter operatively linked to a coding region that encodes a plant 
polypeptide having the ability to catalyze the carboxylation of acetyl-CoA, 
which coding region is operatively linked to a transcription-terminating 
region, whereby said promoter is capable of driving the transcription of said 
coding region in said cell. 

36. The process according to claim 35 wherein said cell is 
a cyanobacterium or a plant cell. 

37. The process according to claim 35 wherein said plant 
polypeptide is a plant acetyl-CoA carboxylase enzyme. 

38. The process according to claim 37 wherein said plant 
acetyl-CoA carboxylase enzyme is a monocotyledonous plant acetyl-CoA 
carboxylase enzyme. 

39. The process according to claim 38 wherein said
monocotyledonous plant acetyl-CoA carboxylase enzyme is wheat acetyl-CoA
carboxylase enzyme.

40. A transformed cyanobacterium produced in accordance 
with the process of claim 36. 



41. A process for determining the inheritance of plant
resistance to herbicides of the aryloxyphenoxypropionate or
cyclohexanedione class, which process comprises the steps of:

(a) measuring resistance to herbicides of the
aryloxyphenoxypropionate or cyclohexanedione class in a parental plant line
and in progeny of said parental plant line;

(b) purifying DNA from said parental plant line and said
progeny;

(c) digesting said DNA with restriction enzymes to form
DNA fragments;

(d) fractionating said fragments on a gel;

(e) transferring said fragments to a filter support;

(f) annealing said fragments with a labelled RFLP probe
consisting of a DNA molecule that encodes acetyl-CoA carboxylase or a
portion thereof;

(g) detecting the presence of complexes between said
fragments and said RFLP probe; and

(h) correlating the herbicide resistance of step (a) with the
complexes of step (g) and thereby the inheritance of herbicide resistance.

42. The process according to claim 41 wherein said acetyl-CoA
carboxylase is a dicotyledonous plant acetyl-CoA carboxylase enzyme.

43. The process according to claim 41 wherein said acetyl-CoA
carboxylase is a mutated monocotyledonous plant acetyl-CoA
carboxylase that confers herbicide resistance or a hybrid acetyl-CoA
carboxylase comprising a portion of a dicotyledonous plant acetyl-CoA
carboxylase, a portion of a dicotyledonous plant acetyl-CoA carboxylase or
one or more domains of a cyanobacterial acetyl-CoA carboxylase.

44. A process for identifying herbicide resistant variants of 
a plant acetyl-CoA carboxylase comprising the steps of: 

(a) transforming cyanobacteria with a DNA molecule that
encodes a monocotyledonous plant acetyl-CoA carboxylase enzyme to form
transformed cyanobacteria;

(b) inactivating cyanobacterial acetyl-CoA carboxylase;

(c) exposing said transformed cyanobacteria to a herbicide 
that inhibits acetyl-CoA carboxylase activity; 

(d) identifying transformed cyanobacteria that are resistant 
to said herbicide; and 

(e) characterizing DNA that encodes acetyl-CoA 
carboxylase from the cyanobacteria of step (d).

[Figure: nucleotide sequence, numbered to about position 3060 at the line ends, with the deduced amino acid sequence in one-letter code printed beneath the coding region; the sequence text is not legible in the source scan.]
[Figure: nucleotide sequence of a coding region, beginning ATGCGTTTCA and ending with a TAG stop codon, followed by its deduced amino acid sequence in one-letter code under the label "amino acid sequence"; the sequence text is not legible in the source scan.]
[Figure: multiple alignment of amino acid sequences in one-letter code, with position markers at 100-residue intervals, for the rows labeled Wh ACC, Rt ACC, Ch ACC, Yt ACC, Sy ACC, An ACC, Ec ACC, Hm PCCA, Rt PCCA, Yt PC, Kp ODA and Ps TC; the aligned residues are not legible in the source scan.]
Biotin carboxylase / biotin carboxyl carrier domain primers

Biotin carboxylase domain          Biotin carboxyl carrier domain

5'-GCTCTAGAATACTATTTCCTG-3'        3'-CRNTACTTYTACNWCTTAAGCT-5'

Primer 3                           Primer 4
[Figure: nucleotide sequence, numbered to about position 1350 at the line ends, with the deduced amino acid sequence in one-letter code beneath it and alternative nucleotides and residues marked above and below individual positions; the sequence text is not legible in the source scan. The legible parts of the amino acid lines begin VMIKASWGGG and end TPYAEVEAMKM.]
Biotin carboxyl carrier protein primers

[Figure: nucleotide sequence, numbered at 90-base intervals at the line ends, with the deduced amino acid sequence in one-letter code beneath it; the nucleotide lines are not legible in the source scan. The amino acid lines read:]

LDFNEIRQLLTTIAQTDIAEVTLKSDDFEL
TVRKAVGVNNSVVPVVTAPLSGVVGSGLPS
AIPIVAHAAPSPSPEPGTSRAADHAVTSSG
SQPGAKIIDQKLAEVASPMVGTFYRAPAPG
EAVFVEVGDRIRQGQTVCIIEAMKM